INTRODUCTION


1.1 INTRODUCTION


The DSP56100 Family Manual (see Figure 1-1) provides a description of the components 
that are common to all DSP56100 family processors and includes a detailed description 
of the basic DSP56100 family instruction set. The DSP56156 User’s Manual and
DSP56166 User’s Manual provide a brief overview of the core processor and detailed
descriptions of the memory and peripherals that are chip specific.


Figure 1-1 DSP56100 Family Product Literature

A DSP561xx User’s Manual and a DSP561xx Technical Data Sheet will be available for 
any future DSP56100 family member. 


1.2 DSP56100 FAMILY FEATURES


The DSP56100 family is based on a programmable CMOS 16-bit Digital Signal Processor
core composed of a 16-bit arithmetic Data ALU (DALU), an Address Generation Unit
(AGU), a Program Controller Unit (PCU), and their associated DSP instruction set.


Table 1-1 gives a description of the DSP Core features. 


Table 1-1 DSP Core Feature List 


Up to 30 Million Instructions per Second (MIPS) at 60 MHz (33.3 ns instruction cycle)
Single-cycle 16 x 16-bit parallel multiply-accumulate
2 x 40-bit accumulators with extension byte
Fractional and integer arithmetic with support for multiprecision arithmetic
Highly parallel instruction set with unique DSP addressing modes
Nested hardware DO loops including infinite loops
Two-instruction LMS adaptive filter loop
Fast auto-return interrupts
Three external interrupt request pins
Three 16-bit internal data buses and three 16-bit internal address buses
Programmable access time on the external bus
On-chip peripheral registers memory mapped in data memory space
Off-chip peripheral space with programmable access time memory mapped in data memory space
Low power wait and stop modes
On-Chip Emulation (OnCE) for unobtrusive, processor speed independent debugging
Operating frequency down to DC
Single power supply
Low power (HCMOS)


The block diagram of the core processor used in the DSP56100 family is shown in Figure 
1-2. 


Figure 1-2 DSP56100 Family Core CPU Block Diagram


The amount and type of on-chip memory vary from chip to chip within the family and so
are not discussed here. However, the architecture allows up to 64K words each of
program memory and data memory (128K words total) to be addressed.


The peripherals and options that can be incorporated on-chip include: 


• A Byte-wide Host Port
• Synchronous Serial Ports
• General Purpose I/O Pins
• Timer With External Access
• A Codec
• On-chip Oscillator
• Interrupt Request Pins


Other peripherals will be designed for new DSP56100 Family members. 


2.1 INTRODUCTION

The heart of the DSP56100 architecture is a 16-bit multiple-bus processor designed
specifically for real-time digital signal processing (DSP). This section presents the
overall architecture and describes the major blocks of the CPU, including the Data ALU
and the Address ALU.


2.2 DSP56100 BLOCK DIAGRAM 
The major components of the CPU are: 


• Data Buses
• Address Buses
• Data ALU
• Address ALU
• Program Control and System Stack


An overall block diagram of the CPU architecture is shown in Figure 2-1. 


2.2.1 Data Buses 

Data movement on the chip occurs over three bidirectional 16-bit buses: the X Data Bus
(XDB), the Program Data Bus (PDB), and the Global Data Bus (GDB). Data transfer
between the Data ALU and the X Data Memory occurs over the XDB when one memory
access is performed, and over both the XDB and the GDB when two simultaneous memory
reads are performed. All other data transfers occur over the GDB. Instruction word
pre-fetches take place in parallel over the PDB. The bus structure supports general
register to register, register to memory, memory to register, and memory to memory data
movement and can transfer up to three 16-bit words in the same instruction cycle.
Transfers between buses are accomplished through the Internal Bus Switch.


As a general rule, when reading any 8-bit register, the unused bits in the most significant 
byte are zero filled and any unused or reserved bits are read as zero. 


2.2.2 Address Buses 

Addresses are specified for internal X Data Memory on two unidirectional 16-bit buses, X 
Address Bus One (XAB1) and X Address Bus Two (XAB2). Program memory addresses 
are specified on the bidirectional Program Address Bus (PAB). 


When external memory spaces have to be addressed, a single 16-bit unidirectional
address bus driven by a three-input multiplexer can select XAB1, XAB2, or the PAB. One
instruction cycle is needed for each external memory access. There is no speed penalty
if only one external memory space is accessed in an instruction and if no wait states are
inserted in the external bus cycle. If two or three external memory spaces are accessed
in a single instruction, there will be a one or two instruction cycle execution delay,
respectively, or more if wait states are inserted on the external bus. A bus arbitrator
controls external accesses, making them transparent to the user.
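
The delay rule above can be sketched as a small Python model. This is only an illustration of the zero-wait-state case stated in the text; the function name is hypothetical, and the effect of wait states is deliberately not modelled because the text does not quantify it.

```python
def external_access_delay(external_spaces: int) -> int:
    """Extra instruction cycles caused by external accesses in one instruction.

    Assumption: no wait states are inserted on the external bus.
    The single external address bus serves one access per instruction
    cycle, so each access beyond the first costs one extra cycle.
    """
    if not 0 <= external_spaces <= 3:
        raise ValueError("an instruction accesses at most three memory spaces")
    return max(0, external_spaces - 1)


# One external space: no penalty. Two or three: one or two cycles of delay.
for spaces in range(4):
    print(spaces, "external space(s) ->", external_access_delay(spaces), "delay cycle(s)")
```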


2.2.3 Internal Bus Switch 

Transfers between buses are accomplished in the Internal Bus Switch. The internal bus 
switch is similar to a switch matrix and can connect any two internal buses without adding 
any pipeline delays. 


2.2.4 Bit Manipulation Unit 

The bit manipulation unit performs bit manipulation and bit field manipulation on memory 
words and register data. It is capable of testing and/or changing a user selected set of bits 
within a byte. 


2.2.5 Data ALU (DALU) 

The Data ALU performs all of the arithmetic and logical operations on data operands. The 
Data ALU consists of four 16-bit input registers, two 32-bit accumulator registers, two 8- 
bit accumulator extension registers, an accumulator shifter, an output shifter, one data 
bus shifter/limiter, and a parallel single cycle non-pipelined Multiply-Accumulator (MAC) 
unit. Data ALU registers may be read or written by the XDB and GDB as 16-bit operands. 
The Data ALU is capable of multiplication, multiply-accumulate with positive or negative 
accumulation, addition, subtraction, shifting, and logical operations in one instruction cy- 
cle. Data ALU arithmetic operations generally use fractional 2’s complement arithmetic. 
Some signed/unsigned and integer operations are also possible. Data ALU source oper- 
ands may be 16, 32 or 40 bits and may originate from input registers and/or accumulators. 
ALU results are always stored in one of the accumulators. The upper 16 bits of an
accumulator can be used as a multiplier input. Arithmetic operations always have a 40-bit
result and logical operations are performed on 16-bit operands yielding 16-bit results in one
of the two accumulators. Refer to Section 3 for a detailed description of the Data ALU ar- 
chitecture. 


2.2.6 Address Generation Unit (AGU) 

The AGU performs all address storage and effective address calculations necessary to
address data operands in memory. This unit operates in parallel with other chip resources
to minimize address generation overhead. The AGU can implement three types of
arithmetic: linear, modulo, and reverse carry. The Address ALU contains four Address
Registers (R0-R3), four Offset Registers (N0-N3), and four Modifier Registers (M0-M3).
The Address Registers are 16-bit registers which may contain address or data. Each
Address Register may be output to the PAB and XAB1. R3 may be accessed for output to
XAB2 when R0, R1, or R2 are output to XAB1. The modifier and offset registers are 16-bit
registers which are normally used to control updating of the address registers. Offset
registers can also be used as 16-bit data general purpose registers.


Figure 2-1 Architecture of the 16-Bit DSP CPU


AGU registers may be read or written by the GDB as 16-bit operands. The AGU can gen- 
erate two 16-bit addresses every instruction cycle: one for either the XAB1 or PAB and 
one for XAB2. The ALU can directly address 65536 locations on the XAB and 65536 lo- 
cations on the XAB2 bus - a total capability of 131,072 16-bit data words. Refer to Section 
4 for a detailed description of the AGU architecture. 
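
As an illustration of the three types of address arithmetic named above, the following Python sketch updates a 16-bit address by an offset in each mode. It is not the AGU's register-level behaviour: the function names and the explicit `base` and `size` parameters of the modulo update are assumptions made for the example, and the way the modifier registers select a mode is described in Section 4, not here.

```python
MASK16 = 0xFFFF


def bit_reverse16(value: int) -> int:
    """Reverse the bit order of a 16-bit word."""
    result = 0
    for _ in range(16):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def linear_update(r: int, n: int) -> int:
    """Linear arithmetic: ordinary 16-bit addition."""
    return (r + n) & MASK16


def modulo_update(r: int, n: int, base: int, size: int) -> int:
    """Modulo arithmetic: wrap inside a circular buffer.

    Assumption: the buffer is given as [base, base + size); the real
    hardware encodes the modulus in a modifier register.
    """
    return base + (r - base + n) % size


def reverse_carry_update(r: int, n: int) -> int:
    """Reverse-carry arithmetic: carries propagate from MSB toward LSB.

    Implemented as a normal addition on bit-reversed operands.
    """
    return bit_reverse16((bit_reverse16(r) + bit_reverse16(n)) & MASK16)


# Reverse carry with an offset of half the table size visits an
# 8-entry table in bit-reversed order, as used by FFT routines.
address, order = 0, []
for _ in range(8):
    order.append(address)
    address = reverse_carry_update(address, 4)
print(order)
```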


2.2.7 X Data Memory 

The On-Chip X Data Memory addresses are received from the XAB1 and XAB2 and data 
transfers occur on the XDB and GDB. Two reads or one write can be performed during 
one instruction cycle on the internal data memory. The on-chip peripherals occupy the top
64 locations in the X data memory space (X:$FFC0-X:$FFFF). X memory may be
expanded off-chip for a total of 65,536 addressable locations.


2.2.8 Program Memory 

The On-Chip Program Memory addresses are received from the program control logic 
(usually the program counter) or from the address ALU on the PAB. The first 64 locations 
of the program memory are reserved for interrupt vectors. The program memory may be 
expanded off-chip for a total of 65,536 addressable locations. 


2.2.9 Bootstrap Memory 
A program bootstrap ROM is read by the program controller only while in the bootstrap
mode, during which the on-chip program RAM is defined as write-only.


2.2.10 Program Control Unit (PCU) and System Stack (SS) 

The Program Control Unit performs instruction prefetch, instruction decoding, hardware 
loop control and exception processing. It contains six 16-bit directly addressable
registers. They are the:


Program Counter (PC),
Loop Address (LA),
Loop Count (LC),
Status Register (SR),
Operating Mode Register (OMR),
Stack Pointer (SP).


The System Stack is a separate internal RAM, 15 locations deep, which stores the PC
and the SR for subroutine calls and long interrupts. For program looping, the stack also
stores the LC and the LA in addition to the PC and SR registers.


2.2.11 External Bus Interface 
A common address bus is used to access external Data Memory, Program Memory, or 
I/O devices when required. Separate select lines control access to the memory spaces. 




For More Information On This Product 
Go to: www.freescale.com 




Freescale Semiconductor, Inc.


SECTION 3 


DATA ALU 




SECTION CONTENTS

3.1 OVERVIEW AND ARCHITECTURE
3.1.1 Data ALU Input Registers (X1, X0, Y1, Y0)
3.1.2 Data ALU Accumulator Registers (A2, A1, A0, B2, B1, B0)
3.1.3 Multiply-Accumulator (MAC) and Logic Unit
3.1.3.1 Multiply-Accumulator (MAC) Array and Logic unit
3.1.3.2 ZB Multiplexer
3.1.3.3 Multiplier Control Recoder (REC)
3.1.3.4 Extension Adder (EXA)
3.1.4 Accumulator Shifter (AS)
3.1.5 Output Shifter (OS)
3.1.6 Data Shifter/Limiter
3.1.6.1 Scaling
3.1.6.2 Limiting
3.2 THE DATA ALU ARITHMETIC AND ROUNDING
3.2.1 Data Representation
3.2.2 Fractional Arithmetic
3.2.3 Integer Arithmetic
3.2.4 Multiprecision Arithmetic Support
3.2.5 Rounding Modes
3.2.5.1 Convergent Rounding
3.2.5.2 Two’s Complement Rounding


3.1 OVERVIEW AND ARCHITECTURE


This section describes the structure and the operation of the Data ALU registers and
hardware, as well as the data representation, rounding, and saturation arithmetic used
within the Data ALU.


The major components of the Data ALU are:


• Data ALU Input Registers
• Data ALU Accumulator Registers
• A parallel single cycle non-pipelined Multiply-Accumulator (MAC) Unit
• An Accumulator Shifter (AS)
• An Output Shifter (OS)
• A Data Shifter/Limiter (S/L)


A block diagram of the Data ALU architecture is shown in Figure 3-1 and a functional
block diagram is shown in Figure 3-2.


Figure 3-1 Data ALU Architecture Block Diagram


3.1.1 Data ALU Input Registers (X1, X0, Y1, Y0)


X1, X0, Y1, and Y0 are 16-bit latches which serve as input registers for the data ALU.
Each register may be read or written by the XDB as well as the GDB. X0, X1, Y0, and Y1
may be read over the XDB. They may be treated as four independent 16-bit registers or
as two 32-bit registers called X and Y which are developed by concatenating X1:X0 and
Y1:Y0 respectively (where X1 and Y1 are the most significant words and X0 and Y0 are
the least significant words in X and Y respectively).


These Data ALU input registers are used as source operands for most data ALU opera- 
tions and allow new operands to be loaded for the next instruction while the register con- 
tents are used by the current instruction. 


3.1.2 Data ALU Accumulator Registers (A2, A1, A0, B2, B1, B0)


A1, A0, B1 and B0 are 16-bit latches which serve as data ALU accumulator registers. A2
and B2 are 8-bit latches which serve as accumulator extension registers. Each register
may be read or written by the XDB as a word operand. A1 and B1 may be read or written
by the GDB. When A2 or B2 is read, the register contents occupy the low-order portion
(bits 7-0) of the word; the high-order portion (bits 15-8) is sign-extended. When A2 or B2
is written, the register receives the low-order portion of the word; the high-order portion is
not used.
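
The read and write behaviour of the extension registers can be sketched in Python as follows. The function names are hypothetical; the sketch only restates the rule above for a 16-bit word.

```python
def read_extension_register(ext8: int) -> int:
    """Word seen on the bus when A2 or B2 is read.

    Bits 7-0 carry the register contents; bits 15-8 repeat bit 7
    (sign extension).
    """
    ext8 &= 0xFF
    return (ext8 | 0xFF00) if (ext8 & 0x80) else ext8


def write_extension_register(word16: int) -> int:
    """Value latched into A2 or B2 on a write: only bits 7-0 are used."""
    return word16 & 0xFF


print(hex(read_extension_register(0x7F)))    # positive extension byte
print(hex(read_extension_register(0x80)))    # negative extension byte
print(hex(write_extension_register(0x12FE)))  # upper byte is ignored
```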


The accumulator registers are treated as two 40-bit registers A (A2:A1:A0) and B 
(B2:B1:B0) for data ALU operations. These accumulator registers receive the 
EXT:MSP:LSP portion of the Multiply-Accumulator unit output and supply a source
accumulator of the same form. Most data ALU operations specify the 40-bit accumulator
registers as source and/or destination operands.




When one accumulator is used as a multiplier input, only the upper portion (A1 or B1) 
can be specified. This upper portion can also be directly used as an address register for 
fast effective address computation. 


Automatic sign extension of the 40-bit accumulators is provided when the A or B register
is written with a smaller size operand. This can occur when writing A or B from the X data
bus or with the results of certain data ALU operations (such as Tcc or TFR). If a word
operand is to be written to an accumulator register (A or B), the MSP portion of the
accumulator is written with the word operand, the LSP portion is zeroed and the EXT portion
is sign-extended from MSP. No sign extension is performed if an individual 16-bit register
(A1, A0, B1, or B0) is written.


Figure 3-2 Data ALU Functional Block Diagram
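
A minimal Python sketch of this word write is shown below. The 40-bit accumulator is modelled as a plain integer laid out as EXT (bits 39-32), MSP (bits 31-16), and LSP (bits 15-0); the function name is hypothetical.

```python
def write_word_to_accumulator(word16: int) -> int:
    """Return the 40-bit accumulator image after a word is written to A or B."""
    msp = word16 & 0xFFFF
    ext = 0xFF if (msp & 0x8000) else 0x00  # EXT is sign-extended from MSP
    lsp = 0x0000                            # LSP is zeroed
    return (ext << 32) | (msp << 16) | lsp


print(f"{write_word_to_accumulator(0x1234):010X}")  # positive word
print(f"{write_word_to_accumulator(0x8000):010X}")  # negative word
```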


The extension registers A2 and B2 offer protection against 32-bit overflow. When the 
result of an accumulation crosses the MSB of MSP (bit 15 of A1 or B1), the extension bit 
of the status register (E bit) is set. Up to 255 overflows or underflows are possible using 
this extension byte, after which the sign is lost beyond the MSB of the EXT register, set- 
ting the overflow bit (V bit) in the status register. 


It is also possible to saturate the accumulator on a 32-bit value automatically after every
accumulation. This is done by setting the saturation bit in the Operating Mode Register
(OMR). The dynamic range of the machine is then limited to 32 bits, and the
limiting bit (L bit) in the status register is set by the saturation.




The overflow detection logic is also used to saturate the output of the shifter/limiter
when the A or B accumulator is read over the XDB or transferred to any data ALU
register. The content of A or B is not affected in that case (except when the same
accumulator is specified as source and destination); only the value transferred over the
XDB is limited to a full-scale positive or negative 16-bit value ($7FFF or $8000,
respectively). This overflow protection is performed after the contents of the
accumulator have been shifted according to the scaling mode defined in the status
register. When limiting occurs, the L bit flag in the status register is set and latched.
Note that shifting and limiting are performed only when an entire 40-bit accumulator
register (A or B) is specified as the source for a parallel data move over the XDB.
Shifting and limiting are not performed when A0, A1, A2, B0, B1, or B2 are individually
specified.


3.1.3 Multiply-Accumulator (MAC) and Logic Unit 


The MAC and logic unit is the main arithmetic processing unit of the DSP and performs
all of the calculations on data operands. The MAC unit accepts up to three input
operands and outputs one 40-bit result of the form Extension:Most Significant Product:Least
Significant Product (EXT:MSP:LSP). The operation of the MAC unit occurs
independently and in parallel with XDB, GDB, and PDB activity. The Data ALU registers provide
pipelining for both data ALU inputs and outputs. Latches are provided on the MAC unit
input to permit writing an input register which is the source for a Data ALU operation in
the same instruction. All ALU operations occur in one instruction cycle. The inputs of the
multiplier can come from the X and Y registers (X1, X0, Y1, Y0) as well as from the MSP
of each accumulator (A1, B1). The multiplier executes 16 x 16-bit parallel
signed/unsigned fractional and signed integer multiplies.


For fractional arithmetic, the 31-bit product is added to the 40-bit contents of either the A 
or B accumulator. The 40-bit sum is stored back in the same accumulator. This multiply/ 
accumulate is a single cycle operation (no pipeline). Integer operations always generate 
a 16-bit result located in the accumulator MSP portion (A1 or B1). Full precision integer 
operations are possible using an ASR instruction after any fractional MPY or MAC. 


If a multiply without accumulation is specified in the instruction, the MAC clears the
accumulator and then adds the contents to the product. The results of all arithmetic
instructions are valid (sign extended and zero filled) 40-bit operands in the form EXT:MSP:LSP,
A2:A1:A0, or B2:B1:B0 (except during integer operations). When a 40-bit result is to be
stored as a 16-bit operand, the LSP can simply be truncated or it can be rounded into the
MSP. The rounding performed is either convergent rounding (round to the nearest
even) or two’s complement rounding. The type of rounding is specified by the rounding
bit in the status register. The bit in the accumulator which is rounded is specified by the
scaling mode bits in the status register.
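
The difference between the two rounding types can be illustrated with a short Python sketch. It assumes the no-scaling case, in which the LSP (bits 15-0) is rounded into the MSP, and it ignores overflow out of the 40-bit accumulator; the accumulator is modelled as a signed Python integer and the function names are hypothetical.

```python
def round_twos_complement(acc: int) -> int:
    """Add one half of the MSP's LSB, then clear the LSP.

    Ties (LSP exactly $8000) always round toward plus infinity.
    """
    return (acc + 0x8000) & ~0xFFFF


def round_convergent(acc: int) -> int:
    """Round to nearest; on an exact tie, round to the even MSP value."""
    rounded = (acc + 0x8000) & ~0xFFFF
    if (acc & 0xFFFF) == 0x8000:  # exact tie: LSP is one half
        rounded &= ~0x10000       # force the LSB of the MSP to zero
    return rounded


# MSP = 2, LSP = $8000 (the value 2.5 in units of the MSP's LSB).
tie = 0x0002_8000
print(hex(round_twos_complement(tie)))  # rounds up to 3
print(hex(round_convergent(tie)))       # rounds to the even value 2
```

Over a long run of data, always rounding ties upward introduces a small positive bias; rounding ties to even removes it, which is the usual reason convergent rounding is offered.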


The major components of the MAC unit are:


• Multiply-Accumulator Array
• ZB Multiplexer
• Multiplier Control Recoder
• Extension Adder
• Logic unit


3.1.3.1 Multiply-Accumulator (MAC) Array and Logic unit 


The multiply-accumulator array is a 16 X 16-bit asynchronous, parallel multiply-accumu- 
lator with 40-bit accumulation. The MAC array is based on the modified Booth’s algo- 
rithm. The MAC array is used in all arithmetic operations. The array performs signed and 
unsigned arithmetic with a fractional data representation and signed arithmetic with an 
integer data representation. The MAC array also performs rounding if specified in the 
DSP instruction. The type of rounding is specified by the scaling mode bits and the 
rounding bit in the status register. 


Three input operands are received on six internal data buses AS2, AS1, AS0, EB, ZB,
and MB. The AS2:AS1:AS0 data bus is the 40-bit source accumulator bus and
represents the EXT:MSP:LSP portion of the source accumulator. The AS2:AS1:AS0 bus is
the output of the accumulator shifter. The ZB data bus is a 16-bit input operand used in
most data ALU operations and represents the multiplicand in multiplication operations.
The MB data bus is a 16-bit input operand which represents the multiplier in
multiplication operations. The ZB and MB buses are concatenated (ZB:MB) to form a 32-bit input
bus for long word operands. The EB bus is concatenated with the ZB and MB buses
(EB:ZB:MB) to form a 40-bit input bus for addition or subtraction of the two full
accumulators.


The logic unit in the MAC array performs the logical operations AND, OR, EOR, and 
NOT on data ALU registers. The logic unit is 16 bits wide and operates on data in the 
MSP portion of the accumulator. The LSP and EXT portions of the accumulator are not 
affected. 


3.1.3.2 ZB Multiplexer 


The ZB Multiplexer sign extends, by one bit, the data coming into the MAC over the ZB 
bus. This sign bit can be cleared by the ZB Multiplexer to obtain an unsigned format for 
these operands. The ZB Multiplexer may also invert data coming into the MAC as 
required. 


3.1.3.3 Multiplier Control Recoder (REC)


The multiplier control recoder directs the operation of the MAC array and performs multi- 
plier operand recoding for the modified Booth’s algorithm multiplication. The MB bus is 
the input to the multiplier control recoder. Data-independent multiplier control line gener- 
ation is performed in the REC for most non-multiplication instructions. For example, the 
multiplier control output for a data ALU addition would be a multiplication by +1 opera- 
tion. For other data ALU operations, the multiplier control recoder generates control line 
constants that do not correspond to a valid multiplier control word. The least significant 
recoder outputs a zero control word and the most significant recoder provides all the 
functions in these cases. 


3.1.3.4 Extension Adder (EXA) 


EXA is an 8-bit adder which serves as an extension accumulator for the MAC array. The
primary source operand is the AS2 internal data bus from the accumulator shifter. For
multiply-accumulate operations, the second source operand is an update constant
generated from the carry and overflow outputs of the MAC array. For 40-bit additions or
subtractions, the EB internal data bus is used as the second source operand. This allows the
two accumulators to be added to or subtracted from each other. The extension adder
output is the EXT portion of the MAC unit output and is the sum of the source operands.


3.1.4 Accumulator Shifter (AS)


The accumulator shifter is an asynchronous parallel shifter with a 40-bit input and a 40- 
bit output. The source accumulator shifting operations are: 


1. No Shift (Unmodified)
2. 1-Bit Left Shift (Arithmetic) ASL
3. 1-Bit Right Shift (Arithmetic) ASR
4. 4-Bit Right Shift (Arithmetic) ASR4
5. 4-Bit Left Shift (Arithmetic) ASL4
6. 16-Bit Right Shift (Arithmetic) ASR16
7. Force to zero


The shifter also performs a 15-bit arithmetic shift to the right during integer
multiply-accumulate (IMAC) instructions. The shifter is implemented immediately before the MAC
accumulator input. The accumulator shifter output can be inverted or forced to zero and
linkages are provided to shift into and out of the condition code carry (C) bit. The
accumulator shifter outputs to the AS2, AS1, and AS0 buses in the internal ALU.


3.1.5 Output Shifter (OS)


The output shifter is an asynchronous parallel shifter with a 40-bit input and a 40-bit
output. This shifter performs a 15-bit left shift on the result of the integer operations
IMPY/IMAC before storing the shifter’s result into an accumulator. The shifted result is
then available in the A1 or B1 MSP for other arithmetic or logical operations.


3.1.6 Data Shifter/Limiter 


The data shifter/limiter provides special post processing on data ALU accumulator regis- 
ters when they are read out to the XDB or to other registers. It consists of a shifter fol- 
lowed by a limiting circuit. 


3.1.6.1 Scaling 


The data shifter is capable of shifting data one bit to the left or right as well as passing 
the data unshifted. It has a 16-bit output and a limiting output indicator. The data shifter is 
controlled by the scaling mode bits in the status register. These mode bits permit 
dynamic scaling of fixed point data using the same program code which permits block 
floating point algorithms to be implemented in a regular fashion. FFT routines would typ- 
ically use this feature to selectively scale each butterfly pass. 


3.1.6.2 Limiting 


Saturation arithmetic is provided to selectively limit overflow when reading a data ALU 
accumulator register. Limiting is performed on the data shifter output. If the contents of 
the selected source accumulator can be represented in the destination operand size 
without overflow, the data limiter is disabled and the operand is not modified. If the con- 
tents of the selected source accumulator cannot be represented without overflow in the 
destination operand size, the data limiter will substitute a “limited” data value having 
maximum magnitude and the same sign as the source accumulator. The value of the
accumulator is not changed. The limited data values are shown in Table 3-1.


Table 3-1 Saturation by the Shifter/Limiter


E bit    MSB of A2/B2    Output of the limiter
0        x               unchanged
1        0               $7FFF
1        1               $8000


The E bit is the extension bit of the status register (SR), which is defined in Section 5.3.6.
Note that during the TFR2 instruction, the limiting is performed on 32 bits when the
accumulator is written to a register.
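
Table 3-1 can be sketched in Python as shown below. The sketch assumes the no-scaling mode and derives the E bit from the accumulator itself: it treats the extension as in use whenever bits 39-31 are not all equal. That derivation is an assumption made for the example (the E bit is defined in Section 5.3.6), and the function name is hypothetical.

```python
def read_accumulator_limited(acc40: int) -> int:
    """16-bit word delivered by the limiter when A or B is read (no scaling)."""
    acc40 &= (1 << 40) - 1
    top9 = acc40 >> 31                 # EXT byte plus the MSB of the MSP
    e_bit = top9 not in (0x000, 0x1FF)  # assumption: E set if extension in use
    if not e_bit:
        return (acc40 >> 16) & 0xFFFF   # value fits: pass the MSP unchanged
    ext_msb = acc40 >> 39               # sign of the accumulator
    return 0x8000 if ext_msb else 0x7FFF


print(hex(read_accumulator_limited(0x0012340000)))  # in range
print(hex(read_accumulator_limited(0x0080000000)))  # +1.0, limited to $7FFF
print(hex(read_accumulator_limited(0xFF00000000)))  # -2.0, limited to $8000
```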


3.2 THE DATA ALU ARITHMETIC AND ROUNDING


The DSP56100 family supports the two’s-complement representation of binary numbers. 
In this format, the sign bit is the MSB of the binary word, which is set to zero for positive 
numbers and set to one for negative numbers. Unsigned numbers are only supported by 
instructions dedicated to multiple precision. 


3.2.1 Data Representation 
Three modes of format adjustments are supported by the 16-bit DSP: 


1. Two’s complement fractional. In this format, the N-bit operand is represented
using the 1.[N-1] format (1 sign bit, N-1 fractional bits). Such a format can
represent numbers between -1 and +1 - 2^-(N-1).


2. Unsigned fractional. Unsigned binary numbers may be thought of as positive
only. The unsigned numbers have nearly twice the magnitude of a signed number
of the same length. An unsigned fraction, D, is a number whose magnitude
satisfies the inequality:

0.0 <= D < 2.0

Examples of unsigned fractional numbers are 0.25, 1.25, and 1.999. The binary
word is interpreted as having a binary point after the most significant bit (MSB).
The most positive number is $FFFF or {1.0 + (1 - 2^-(N-1))} = 1.99996948 (for
N=16 bits). The smallest positive number is zero ($0000).


3. Two’s complement integer. This format is used by two instructions, the integer
multiply and multiply-accumulate (IMPY/IMAC). Using this format, the N-bit
operand is represented using the N.0 format (N integer bits). Such a format can
represent numbers between -2^(N-1) and [2^(N-1) - 1].
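
The three interpretations of a 16-bit word listed above can be checked with a short Python sketch. The function names are hypothetical; each function simply applies the bit weighting of its format for N = 16.

```python
def signed_fraction(word: int) -> float:
    """Two's complement fractional (1.15): range -1 to +1 - 2^-15."""
    word &= 0xFFFF
    if word & 0x8000:
        word -= 0x10000
    return word / 32768.0


def unsigned_fraction(word: int) -> float:
    """Unsigned fractional: binary point after the MSB, range 0 to just under 2."""
    return (word & 0xFFFF) / 32768.0


def signed_integer(word: int) -> int:
    """Two's complement integer (16.0): range -32768 to 32767."""
    word &= 0xFFFF
    return word - 0x10000 if (word & 0x8000) else word


for w in (0x4000, 0x7FFF, 0x8000, 0xFFFF):
    print(f"${w:04X}", signed_fraction(w), unsigned_fraction(w), signed_integer(w))
```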


The operand is written to the most significant accumulator register (A1 or B1) and its 
most significant bit is automatically sign extended through the accumulator extension 
register to maintain alignments of the binary point when a word operand is written to A or 
B. The least significant accumulator register is automatically cleared. See Figure 3-3 for 
more details on bit weighting and operand alignments.


Figure 3-3 Bit Weighting and Alignments for Operands in Fractional and Integer
Representation (panels: Fractional 2’s Complement Representation; Integer 2’s
Complement Representation)


3.2.2 Fractional Arithmetic 


Figure 3-4 shows the Multiply-Accumulation implementation for fractional arithmetic. The 
multiplication of two 16-bit signed fractional operands gives a 32-bit signed fractional 
intermediate result with the LSB always set to zero. This intermediate result is added to 
one of the 40-bit accumulators. If rounding is specified in the MPY or MAC instruction 
(MACR or MPYR), the intermediate result will be rounded to 16 bits before being stored 
back to the destination accumulator.


Figure 3-4 Fractional Arithmetic
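
The fractional multiply described in Section 3.2.2 can be sketched in Python as follows. The product of two 1.15 operands is shifted left by one bit to give a 1.31 result, which is why its LSB is always zero. The function names are hypothetical, and rounding and accumulation are not modelled.

```python
def to_signed16(word: int) -> int:
    """Interpret a 16-bit word as a two's complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if (word & 0x8000) else word


def fractional_product(x: int, y: int) -> int:
    """Signed fractional product of two 16-bit words, in 1.31 weighting."""
    return (to_signed16(x) * to_signed16(y)) << 1


def as_fraction(product: int) -> float:
    """Convert a 1.31-weighted product to its real value."""
    return product / 2 ** 31


print(as_fraction(fractional_product(0x4000, 0x4000)))  # 0.5 * 0.5
print(as_fraction(fractional_product(0xC000, 0x4000)))  # -0.5 * 0.5
# -1.0 * -1.0 gives +1.0, which does not fit in 32 bits; the 40-bit
# accumulator's extension byte is what makes this result representable.
print(as_fraction(fractional_product(0x8000, 0x8000)))
```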


3.2.3 Integer Arithmetic 


Figure 3-5 shows the Multiply and Multiply-Accumulate operations for integer arithmetic 
and Figure 3-6 describes the implementation of the Integer Multiply-Accumulate. The 
multiplication/multiply-accumulate of two 16-bit signed integer operands (IMPY/IMAC) 
gives a 16-bit signed integer result in the MSP (A1 or B1). EXT (A2 or B2) is sign 
extended and the LSP (AO or BO) is unchanged. Since AO and BO remain unchanged by 
integer arithmetic instructions, these two registers can be used as two additional data 
ALU registers when using IMAC, IMPY, INC24, DEC24, CLR24, SWAP, and EXT 
instructions. Full precision 40-bit integer operations are possible using a fractional MPY 
or a series of MACs followed by an ASR instruction. 


CAUTION 
Overflow control and rounding are not performed during inte- 
ger multiplication and integer multiply-accumulate. 


Integer arithmetic is optimized for new address generation using the multiplier. For 
example, when an address register Rn has to be updated to Rn + x0*y0 before fetching 
new data from memory, the following sequence of code can be used: 


    move    Rn,a            ;a=Rn
    imac    x0,y0,a         ;a1=Rn+x0*y0
    move    X:(a1),b        ;b1=X:<Rn+x0*y0>
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Input Operand 1 Input Operand 2 


Signed Integer 
Input Operands 


16 bits 16 bits 


16 bits 


Signed 
Intermediate 
Multiplier Result 


S Ext. 
EXP unchanged 


Signed Integer 
Output 


Figure 3-5 Integer Arithmetic (IMPY/IMAC) 


[Figure: 16.0 x 16.0 multiply followed by the output shifter]

Figure 3-6 IMAC Implementation 
3.2.4 Multiprecision Arithmetic Support 


A set of data ALU operations is provided to facilitate multiprecision multiplications.
When these instructions are used, the multiplier accepts some combinations of 
signed two’s complement format and unsigned format. These instructions are: 


1. MPY/MAC su: multiplication and multiply-accumulate with signed times 
unsigned operands 


2. MPY/MAC uu: multiplication and multiply-accumulate with unsigned times 
unsigned operands 


3. DMACss: multiplication with signed times signed operands and 16-bit 
arithmetic right shift of the accumulator before accumulation 


4. DMACsu: multiplication with signed times unsigned operands and 16-bit 
arithmetic right shift of the accumulator before accumulation 


5. DMACuu: multiplication with unsigned times unsigned operands and 16- 
bit arithmetic right shift of the accumulator before accumulation 


Figure 3-7 shows how the DMAC instruction is implemented inside the Data ALU and 
Figure 3-8 illustrates the use of these instructions in the case of a double precision
multiplication. The signed x signed operation is used to multiply or multiply-accumulate
the two upper, signed, portions of two signed double precision numbers. The unsigned x 
signed operation is used to multiply or multiply-accumulate the upper, signed, portion of 
one double precision number with the lower, unsigned, portion of the other double
precision number. The unsigned x unsigned operation is used to multiply or
multiply-accumulate the lower, unsigned, portion of one double precision number with the
lower, unsigned, portion of the other double precision number. 
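
The decomposition described above can be checked with a short Python sketch. It is an arithmetic illustration only, not a model of the instruction sequence: it works on plain integers, ignores the left shift that fractional multiplication applies, and the helper names split32 and mul32x32 are invented for this example.

```python
def split32(value):
    """Split a signed 32-bit integer into (signed upper 16 bits, unsigned lower 16 bits)."""
    value &= 0xFFFFFFFF
    high = value >> 16
    low = value & 0xFFFF
    if high & 0x8000:
        high -= 0x10000
    return high, low


def mul32x32(x, y):
    """Build a 64-bit product from four 16 x 16 partial products."""
    xh, xl = split32(x)
    yh, yl = split32(y)
    uu = xl * yl                # unsigned x unsigned: XL x YL
    su = xh * yl + yh * xl      # signed x unsigned: XH x YL and YH x XL
    ss = xh * yh                # signed x signed: XH x YH
    return (ss << 32) + (su << 16) + uu


print(mul32x32(-123456789, 987654321) == -123456789 * 987654321)
```

Each 16-bit shift between partial products corresponds to the 16-bit arithmetic right shift of the accumulator that the DMAC variants perform before accumulating.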
[Figure: 1.15 x 1.15 multiply; accumulator shifter; accumulate (25.15 + 1.31)]

Figure 3-7 DMAC Implementation 


3.2.5 Rounding Modes 


The DSP56100 family implements two types of rounding: convergent rounding and two’s 
complement rounding. The type of rounding is selected by the OMR rounding bit (R bit). 


3.2.5.1 Convergent Rounding 


This is the default rounding mode. Convergent rounding is also called round-to-nearest 
even number. It prevents the bias that would be introduced if the halfway case were
always rounded in the same direction: a halfway value is rounded down if the number is
even (LSB=0) and rounded up if the number is odd (LSB=1). Figure 3-9 
shows the four possible cases for rounding a number in the A1 or B1 register. If the Least 
Significant Portion (LSP) of a number is less than half (<$8000) of the bit to be rounded 
(LSB), the number is rounded down and if the LSP of the number is greater than half of 
the LSB (>$8000) the number is rounded up. If the LSP is exactly equal to half of the 
LSB ($8000) and the LSB of the MSP is odd, the number is rounded up whereas if the 
LSB of the MSP is even, the number is rounded down, i.e., truncated. This technique 
eliminates the bias in truncation rounding. 
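
The four cases can be written out as a small Python sketch. It assumes the no-scaling case, takes the 32-bit A1:A0 pair as an unsigned integer, and ignores any carry into A2; the function name is hypothetical.

```python
def round_convergent(acc32):
    """Round the 32-bit A1:A0 value to 16 bits using round-to-nearest-even."""
    a1 = (acc32 >> 16) & 0xFFFF
    a0 = acc32 & 0xFFFF
    if a0 > 0x8000:
        a1 += 1                      # more than half an LSB: round up
    elif a0 == 0x8000 and (a1 & 1):
        a1 += 1                      # exactly half and A1 odd: round up to even
    return a1 & 0xFFFF               # below half, or half with A1 even: truncate


for value in (0x00047FFF, 0x00048001, 0x00048000, 0x00058000):
    print(hex(value), "->", hex(round_convergent(value)))
```

In the two halfway cases the result is always even (4 and 6 in the loop above), so the rounding error averages out instead of accumulating in one direction.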


Block diagrams of the rounding implementations for the cases of no scaling, scaling 
down and scaling up are shown in Figure 3-9, Figure 3-10, and Figure 3-11, respectively. 
Scaling modes require that the zero detect hardware and LSB Even gate have one of 
three forms since the LSB moves with the scaling mode. 
[Figure: 32-bit operands XH:XL and YH:YL; partial products XL x YL, XH x YL, YH x XL,
and XH x YH; 64-bit result held in a2:a1:a0 and b1:b0]

    mpyuu   x0,y0,a     ;Unsigned x Unsigned: XL x YL
    move    a0,b0
    dmacsu  x1,y0,a     ;Signed x Unsigned: XH x YL
    macsu   y1,x0,a     ;Signed x Unsigned: YH x XL
    move    a0,b1
    dmacss  x1,y1,a     ;Signed x Signed: XH x YH

Figure 3-8 Double Precision Multiplication 
CASE II: A0 > 0.5 (>$8000), then round up (add 1 to A1) 
CASE III: A0 = 0.5 (=$8000) and LSB of A1 = 0 (even), then round down (add zero to A1) 
CASE IV: A0 = 0.5 (=$8000) and LSB of A1 = 1 (odd), then round up (add 1 to A1) 

Figure 3-9 Convergent Rounding 

Figure 3-10 Convergent Rounding Implementation - No Scaling 

Figure 3-11 Convergent Rounding Implementation - Scale Down 


3.2.5.2 Two’s Complement Rounding 


When two’s complement rounding is selected by setting the rounding bit in the OMR, one 
is added to the bit to the right of the rounding point (bit 15 of A0 when no scaling; bit 0 of 
A1 when scaling down; bit 14 of A0 when scaling up) before the bit truncation during a 
rounding operation. Figure 3-12 shows the two possible cases. 
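
For the no-scaling case, the operation amounts to adding $8000 to the 32-bit A1:A0 value and truncating. A minimal Python sketch, assuming an unsigned 32-bit representation and ignoring the carry into A2 (the function name is illustrative):

```python
def round_twos_complement(acc32):
    """Add one at bit 15 of A0, then keep only the upper 16 bits (A1)."""
    return ((acc32 + 0x8000) >> 16) & 0xFFFF


for value in (0x00047FFF, 0x00048000, 0x00058000):
    print(hex(value), "->", hex(round_twos_complement(value)))
```

Unlike convergent rounding, the halfway value $8000 is always rounded up here, whether A1 is even or odd.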
CASE I: A0 < 0.5 (<$8000), then round down 

[Figure: before rounding, A1 = XX...XX0100 and A0 = 011XXX...XX]

CASE II: A0 >= 0.5 (>=$8000), then round up 

Figure 3-12 Two’s Complement Rounding (No-scaling) 


Once the rounding bit has been programmed in the OMR, there is a delay of one
instruction cycle before the new rounding mode becomes active. 
4.1 INTRODUCTION 
The major components of the AGU are: 


• Address Register Files 
• Offset Register Files 
• Modifier Register Files 
• Address Arithmetic Unit Containing: 
  - Temporary Address Register 
  - Local Status Register 
  - PC Relative Addressing Unit 
  - Secondary Offset Adder Unit 
  - Modulo Arithmetic Unit 
  - Address Output Multiplexer 
A block diagram of the AGU is shown in Figure 4-1. 


4.2 ADDRESS REGISTER FILE (Rn) 

The Address Register File consists of four 16-bit registers. The file contains the
address registers R0-R3 which usually contain addresses used as pointers to memory. Each 
register may be read or written by the Global Data Bus. High speed access to the XAB1 
and XAB2 buses is required to allow maximum access time for the internal and external 
X Data Memory and Program Memory. Each address register may be used as an input to 
the modulo arithmetic unit for a register update calculation. Each register may be written 
by the Global Data Bus or by the output of the modulo arithmetic unit. 


R2, R3 and Temp may be used as inputs to a separate offset adder for an independent 
register update calculation. This special update calculation occurs during parallel, dual 
reads (using R3) and during offset by absolute immediate offsets (using R2+$xx). 


CAUTION 


Due to pipelining, if an address register (M, N, or R) is changed 
with a MOVE instruction, the new contents will not be available for 
use as a pointer until the second following instruction. 


4.3 OFFSET REGISTER FILE (Nn) 

The Offset Register File consists of four 16-bit registers. The file contains the offset 
registers N0-N3 and usually contains offset values used to update address pointers. Each 
offset register may be read or written by the Global Data Bus. Each offset register is read 
when the same number address register is read and used as an input to the modulo
arithmetic unit. 
Figure 4-1 AGU Block Diagram 


4.4 MODIFIER REGISTER FILE (Mn) 
The Modifier Register File consists of four 16-bit registers. The file contains the modifier 
registers M0-M3 and usually specifies the type of arithmetic used to modify an address 
register during address register update calculations. Each modifier register may be read 
or written by the Global Data Bus. Each modifier register is read when the same number 
address register is read and used as an input to the modulo arithmetic unit. Each modifier 
register is preset to $FFFF during a processor reset. 


4.5 TEMPORARY ADDRESS REGISTER 
The temporary address register, Temp, is a 16-bit register which provides for: 
1. temporary storage for an absolute address loaded from the Program Data Bus, 

2. the immediate data loaded from the Global Data Bus, 

3. Address Register Indirect with Immediate Displacement addressing mode, 

4. the contents of A1 or B1 registers used by the Accumulator Register Indirect 
Addressing mode, or 

5. the output of the modulo arithmetic unit. 


The modulo arithmetic unit output is loaded into the Temp register during the pre-update 
cycle of the indexed by offset addressing mode, of the pre-decrement addressing mode, 
and during the LEA instruction. In each of these addressing modes, an address register 
is accessed, updated by the modulo arithmetic unit, and stored in Temp in one instruction 
cycle. In the following cycle, the content of Temp is used to address the X memory. For 
all absolute addressing modes, the address of the operand is written into Temp and then 
used to address X: or P: memory. 


4.6 AGU STATUS REGISTER 

The 3-bit local status register in the AGU, which cannot be accessed by the user, is 
updated after every address register update, i.e., by any addressing mode that updates
an address register, regardless of the memory access type. 


Updating of the local status register is as follows: 
sr_v ← set if the modulo circuit performed a wrap, clear otherwise. 
sr_z ← set if the result of the address update is zero, clear otherwise. 
sr_n ← set if the result of the address update is negative, clear otherwise. 


The CHKAAU instruction will copy the AGU status register to SR as follows: 


V ← sr_v 
Z ← sr_z 
N ← sr_n 


During double parallel reads, only the update of the address register used for the first
parallel read (not R3) will affect the local status register. 


Note: Only the V, Z, N bits of SR will be changed. 
4.7 PC RELATIVE ADDRESSING UNIT 

The PC Relative Addressing Unit performs the PC relative address computation with sign 
extension done on the program address offset. The result is gated onto the Program
Address Bus by a control signal from the program controller. 


4.8 SECONDARY OFFSET ADDER UNIT 

The Secondary Offset Adder Unit is used for an address update calculation during double 
data memory read instructions, or for the addition of address register and immediate
displacement. 


4.9 MODULO ARITHMETIC UNIT 

The Modulo Arithmetic Unit contains one 16-bit full adder (called the offset adder) which 
may add one, subtract one, or add the contents of the respective signed offset register N 
to the contents of the selected address register. A second full adder (called the modulo 
adder) adds the summed result of the first full adder to a modulo value M or minus M, 
where M is stored in the respective modifier register. A third full adder (called the reverse 
carry adder) adds the constant one, minus one, or the offset N (stored in the respective
offset register) to the selected address register with the carry propagating in the reverse
direction, from the most significant bit to the least. The offset adder and the reverse
carry adder are in parallel and share common inputs. Test logic determines which of the
three summed outputs of the full adders is output to the address register file or
temporary register. 


The modulo arithmetic unit can update one address register, Rn, during one instruction 
cycle. It is capable of performing linear, reverse carry, and modulo arithmetic. The
contents of the selected modifier register specify the type of arithmetic required in an
address register update calculation. The modifier value is decoded in the modulo arithmetic 
unit and affects the unit’s operation. The modulo arithmetic unit’s operation is
data-dependent and requires execution cycle decoding of the selected modifier register
contents. Note that for dual reads, there is no modulo capability for an R3 update; linear
arithmetic will be used. 


The output of the offset adder gives the result of linear arithmetic (e.g. Rn+1; Rn+N) and 
is selected as the modulo arithmetic unit’s output for linear arithmetic addressing
modifiers. The reverse carry adder performs the required operation for reverse carry
arithmetic and its output is selected as the modulo arithmetic unit’s output for reverse
carry addressing modifiers. Reverse carry arithmetic is useful for 2^k point FFT
addressing. For modulo arithmetic, the modulo arithmetic unit will perform the function
(Rn+N) modulo M where N can be one, minus one, or the contents of the offset register Nn.
If the modulo operation requires wraparound for modulo arithmetic, the summed output of
the modulo adder will give the correct updated address register value; otherwise, if
wraparound is not necessary, the output of the offset adder gives the correct result. 


The test logic will determine which output address to select. If the contents of the
respective modifier register, M, specify linear or reverse carry arithmetic, the output of
the modulo arithmetic unit will be the output of the offset adder or reverse carry adder, 
respectively. If M specifies a modulo value (modulo arithmetic) the output of the modulo 
arithmetic unit will be based on the results of both the offset and modulo adders. 
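
The selection performed by the test logic can be sketched in Python as follows. This is a behavioral model only, not a description of the hardware: the function names are invented, the modifier encodings ($FFFF linear, $0000 reverse carry, M-1 for modulo M) are taken from the modifier summary in Table 4-2, and the base address is assumed to be Rn with its k LSBs cleared, where 2^k >= M.

```python
LINEAR = 0xFFFF
REVERSE_CARRY = 0x0000


def bitrev16(value):
    """Reverse the order of the 16 bits of value."""
    return int(format(value & 0xFFFF, "016b")[::-1], 2)


def agu_update(rn, step, m_reg):
    """Return the updated Rn for step = +1, -1, or the signed offset Nn."""
    raw = rn + step
    offset_out = raw & 0xFFFF                          # offset adder output
    if m_reg == LINEAR:
        return offset_out
    if m_reg == REVERSE_CARRY:                         # reverse carry adder output
        return bitrev16((bitrev16(rn) + bitrev16(step)) & 0xFFFF)
    if m_reg > 0x7FFF:
        raise ValueError("reserved modifier value")
    modulus = m_reg + 1
    k = m_reg.bit_length()                             # smallest k with 2**k >= modulus
    base = rn & ~((1 << k) - 1)
    if raw > base + m_reg:
        return (raw - modulus) & 0xFFFF                # modulo adder: add minus M
    if raw < base:
        return (raw + modulus) & 0xFFFF                # modulo adder: add M
    return offset_out                                  # no wrap needed


print(agu_update(55, 1, 23), agu_update(32, -1, 23), agu_update(0, 512, REVERSE_CARRY))
```

With Mn = 23 (modulo 24) and a buffer based at 32, stepping past address 55 returns to 32, and stepping below 32 returns to 55. Only a single wrap is modeled, matching the single subtraction or addition of M described above.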


The modulo arithmetic unit is also used in a special way during execution of the NORM 
instruction. For the NORM instruction, the modulo arithmetic unit computes three values: 
Rn, Rn-1 and Rn+1. Depending on the result of the Data ALU operation, one of the three 
is selected for the register update. (See the NORM instruction in Appendix A.) 


4.10 ADDRESSING MODES 

The DSP56100 family instruction set contains a full set of operand addressing modes. All 
address calculations are performed in the Address Generation Unit to minimize execution 
time and loop overhead. 


Addressing modes specify whether the operand(s) is in a register or memory and provide 
the specific address of the operand(s). An effective address in an instruction will specify 
an addressing mode, and for some addressing modes, the effective address will further 
specify an address register. In addition, address register indirect modes require additional 
address modifier information which is not encoded in the instruction. The address modifier 
information is specified in the selected address modifier register(s). All memory
references require one address modifier and the dual X memory reference requires one or
two address modifiers. The definition of certain instructions implies the use of specific
registers and the addressing modes used. 


Address register indirect modes require an offset and a modifier register for use in
address calculations. These registers are implied by the address register specified in an
effective address in the instruction word. Each offset register, Nn, and each modifier
register, Mn, is assigned to an address register, Rn, having the same register number, n,
forming a triplet. Thus the assigned triplets are M0;N0;R0, M1;N1;R1, M2;N2;R2, and
M3;N3;R3. The address register Rn is used as the address register, the offset register Nn
is used to specify an optional offset, and the modifier register Mn is used to specify an
addressing mode modifier. 
The addressing modes are grouped into four categories: register direct, address register 
indirect, PC relative, and special. These addressing modes are described below and
summarized in Table 4-1. 


4.10.1 Register Direct Modes 
These effective addressing modes specify that the operand is in one (or more) of the 10 
Data ALU registers, 12 address registers or 7 control registers. 


4.10.1.1 Data or Control Register Direct 
The operand is in one, two, or three Data ALU register(s) as specified in a portion of the 
data bus movement field in the instruction. This addressing mode is also used to specify 
a control register operand for special instructions. This reference is classified as a register 
reference. 


4.10.1.2 Address Register Direct 
The operand is in one of the 12 address registers (Rn, Mn, and Nn) specified by an
effective address in the instruction. This reference is classified as a register reference. 


CAUTION 


Due to pipelining, if an address register (Mn, Nn, or Rn) is changed with a 
MOVE instruction, the new contents will not be available for use as a pointer 
until the second following instruction. 


4.10.2 Address Register Indirect Modes 
The effective address in the instruction specifies the address register Rn and the address 
calculation to be performed. These addressing modes specify that the operand(s) is in 
memory and provide the specific address of the operand(s). When an address register is 
used to point to a memory location, the addressing mode is called address register
indirect. The term indirect is used because the operand is not the address register itself,
but the contents of the memory location pointed to by the address register. A portion of
the data bus movement field in the instruction specifies the memory reference to be
performed. The type of address arithmetic used is specified by the address modifier
register, Mn. 


4.10.2.1 No Update (Rn) 
The address of the operand is in the address register Rn. The contents of the Rn register 
are unchanged. The Mn and Nn registers are ignored. This reference is classified as a 
memory reference. 
4.10.2.2 Postincrement by 1 (Rn)+ 
The address of the operand is in the address register Rn. After the operand address is 
used, it is incremented by 1 and stored in the same address register. The type of
arithmetic used to increment Rn is determined by Mn. The Nn register is ignored. This
reference is classified as a memory reference. 


4.10.2.3 Postdecrement by 1 (Rn)- 
The address of the operand is in the address register Rn. After the operand address is 
used, it is decremented by 1 and stored in the same address register. The type of
arithmetic used to decrement Rn is determined by Mn. The Nn register is ignored. This
reference is classified as a memory reference. 


4.10.2.4 Postincrement by Offset Nn (Rn)+Nn 

The address of the operand is in the address register Rn. After the unsigned operand
address is used, the contents of the Nn register are added to Rn and stored in the same
address register. The content of Nn is treated as a 2’s complement number and can
therefore be interpreted as signed or unsigned. The contents of the Nn register are
unchanged. The type of arithmetic used to increment Rn is determined by Mn. This reference
is classified as a memory reference. 


4.10.2.5 Indexed by Offset Nn (Rn+Nn) 
The address of the operand is the sum of the contents of the address register Rn and the 
contents of the address offset register Nn. This addition occurs before the operand can 
be accessed and therefore requires an extra instruction cycle. The content of Nn is treated 
as a 2’s complement number and can therefore be interpreted as signed or unsigned. The 
contents of the Rn and Nn registers are unchanged. The type of arithmetic used to add 
Nn to Rn is determined by Mn. This reference is classified as a memory reference. 


4.10.2.6 Predecrement by 1 -(Rn) 
The address of the operand is the contents of the address register Rn decremented by 1. 
Before the operand address is used, it is decremented by 1 and stored in the 
same address register. The type of arithmetic used to decrement Rn is determined by Mn. 
The Nn register is ignored. This reference is classified as a memory reference. 
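
The six address register indirect modes can be summarized with a small Python model. It is a sketch under simplifying assumptions: linear arithmetic (Mn = $FFFF) is used for every update, the register file is a plain dictionary with hypothetical keys "R" and "N", and the mode is passed as the assembler-style string used in the headings above.

```python
def effective_address(regs, n, mode):
    """Return the operand address for triplet n and apply the register update."""
    r, offset = regs["R"][n], regs["N"][n]
    if mode == "(Rn)":                         # no update
        return r
    if mode == "(Rn)+":                        # postincrement by 1
        regs["R"][n] = (r + 1) & 0xFFFF
        return r
    if mode == "(Rn)-":                        # postdecrement by 1
        regs["R"][n] = (r - 1) & 0xFFFF
        return r
    if mode == "(Rn)+Nn":                      # postincrement by offset Nn
        regs["R"][n] = (r + offset) & 0xFFFF
        return r
    if mode == "(Rn+Nn)":                      # indexed by Nn, Rn unchanged
        return (r + offset) & 0xFFFF
    if mode == "-(Rn)":                        # predecrement by 1
        regs["R"][n] = (r - 1) & 0xFFFF
        return regs["R"][n]
    raise ValueError("unknown addressing mode: " + mode)


regs = {"R": [0x0100, 0, 0, 0], "N": [4, 0, 0, 0]}
for mode in ("(Rn)+", "(Rn)+Nn", "(Rn+Nn)", "-(Rn)"):
    print(mode, hex(effective_address(regs, 0, mode)), hex(regs["R"][0]))
```

The post-update modes return the old pointer value and change Rn afterward, while the predecrement mode changes Rn first and the indexed mode leaves Rn untouched.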


4.10.3 PC Relative Modes 
In the PC relative addressing modes used in the BRA and DO instructions, the address 
of the operand is obtained by adding a displacement, represented in two’s complement 
format, to the value of the program counter (PC). The PC always points to the address of 
the next instruction, so PC relative addressing with zero displacement will produce the
address of the next sequential instruction in program memory. 


4.10.3.1 Long Displacement PC Relative 

This addressing mode requires one word of instruction extension. The address of the
operand is the sum of the contents of the PC and the extension word. This reference is
classified as a register reference. 


4.10.3.2 Short Displacement PC Relative 
The short displacement occupies 8 bits in the instruction operation word. The
displacement is first sign extended to 16 bits and then added to the PC to obtain the
address of the operand. This reference is classified as both a register reference and a
memory reference. 
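
The sign extension and addition can be illustrated with a few lines of Python. This is a sketch with an invented function name; it assumes, as stated above, that the PC already holds the address of the next instruction, and that the sum wraps at 16 bits.

```python
def short_pc_relative(pc, disp8):
    """Sign extend an 8-bit displacement to 16 bits and add it to the PC."""
    displacement = disp8 - 0x100 if disp8 & 0x80 else disp8
    return (pc + displacement) & 0xFFFF


print(hex(short_pc_relative(0x0100, 0x00)))   # zero displacement
print(hex(short_pc_relative(0x0100, 0xFE)))   # displacement of -2
print(hex(short_pc_relative(0x0100, 0x7F)))   # largest forward displacement
```

The 8-bit field therefore reaches from 128 words behind to 127 words ahead of the next instruction.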


4.10.3.3 Address Register PC Relative 

The address of the operand is the sum of the contents of the address register Rn and the 
PC. The Mn and Nn registers are ignored. This reference is classified as a register
reference. 


4.10.4 Special Address Modes 
The special address modes do not use an address register in specifying an effective
address. These modes specify the operand or the address of the operand in a field of the 
instruction or they implicitly reference an operand. 


4.10.4.1 Upper Word of Accumulator 
This addressing mode uses the contents of either A1 or B1 to address an operand in 
memory. No update is performed. It is available for single parallel memory moves. This 
reference is classified as an X memory reference. 


4.10.4.2 Immediate Data 

This addressing mode requires one word of instruction extension. The immediate data is 
a word operand in the extension word of the instruction. This reference is classified as a 
program reference. 
4.10.4.3 Immediate Short Data 
The 8-bit operand is in the instruction operation word. The 8-bit operand is used for the 
ANDI, DO, ORI, and REP instructions in addition to the immediate move to register
instruction. This reference is classified as a program reference. 


4.10.4.4 Absolute Address 

This addressing mode requires one word of instruction extension. The address of the
operand is in the extension word. This reference is classified as both a memory reference 
and a program reference. 


4.10.4.5 Absolute Short Address 
For the Absolute Short addressing mode the address of the operand occupies 5 bits in the 
instruction operation word and is zero extended. This reference is classified as both a 
memory reference and a program reference. 


4.10.4.6 Short Jump Address 
The operand occupies 8 bits in the instruction operation word. The address is zero
extended to 16 bits and is unsigned. This reference is classified as a program memory
reference. 


4.10.4.7 I/O Short Address 
For the I/O short addressing mode the address of the operand occupies 5 bits in the
instruction operation word and is ones extended. I/O short is used with the bit
manipulation and move peripheral data instructions. This reference is classified as an X
memory reference. 
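
The three short address forms differ only in field width and in how the upper bits are filled. The following Python sketch shows the resulting 16-bit address ranges; the function names are illustrative and the field values are assumed to be already extracted from the instruction word.

```python
def absolute_short(field5):
    """5-bit address, zero extended: $0000-$001F."""
    return field5 & 0x1F


def short_jump(field8):
    """8-bit address, zero extended: $0000-$00FF."""
    return field8 & 0xFF


def io_short(field5):
    """5-bit address, ones extended: $FFE0-$FFFF."""
    return 0xFFE0 | (field5 & 0x1F)


print(hex(absolute_short(0x1F)), hex(short_jump(0xFF)), hex(io_short(0x00)))
```

Ones extension places the I/O short range at the very top of the X data space.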


4.10.4.8 Implicit Reference 
Some instructions make implicit reference to the program counter (PC), system stack 
(SSH, SSL), loop address register (LA), loop counter (LC), or status register (SR). The 
registers implied and their use are defined by the individual instruction descriptions (see 
Appendix A). This reference is classified as both a register reference and a program
reference. 


4.10.4.9 Indexed by Short Displacement 
This addressing mode uses one extension word which contains the 8-bit short index and 
precedes the opcode word. The index requires an extra instruction cycle and always
indexes address register R2. This addressing mode is available for MOVEM and MOVEC 
instructions as well as single parallel memory moves. This reference is classified as an X 
memory reference. 


4.10.5 Addressing Modes Summary 
Table 4-1 contains a summary of the addressing modes discussed in the previous
paragraphs. 


4.11 ADDRESS MODIFIER TYPES 

The DSP56100 family Address Generation Unit supports linear, modulo, and bit-reversed 
address arithmetic for all address register indirect modes. Address modifiers determine 
the type of arithmetic used to update addresses. Address modifiers allow the creation of 
data structures in memory for FIFOs (queues), delay lines, circular buffers, stacks, and 
bit-reversed FFT buffers. Data is manipulated by updating address registers (pointers) 
rather than moving large blocks of data. The contents of the address modifier register, Mn, 
define the type of address arithmetic to be performed for addressing mode calculations, 
and for the case of modulo arithmetic, the contents of Mn also specify the modulus. All 
address register indirect modes may be used with any address modifier type. Each
address register Rn has its own modifier register Mn associated with it. 


4.11.1 Linear Modifier 

The address modification is performed using normal 16-bit (modulo 65,536) two’s
complement linear arithmetic. A 16-bit offset Nn, or immediate data (+1, -1, or a
displacement value) may be used in the address calculations. The range of values may be
considered as signed (Nn from -32,768 to +32,767) or unsigned (Nn from 0 to +65,535).
There is no arithmetic difference between these two data representations. Addresses are
normally considered unsigned; data is normally considered signed. 
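
A short Python check illustrates why the signed and unsigned readings of Nn are interchangeable under modulo 65,536 arithmetic. The function name is invented for the example.

```python
def linear_update(rn, nn):
    """16-bit linear address update: (Rn + Nn) modulo 65,536."""
    return (rn + nn) & 0xFFFF


# Nn = -1 (signed reading) and Nn = 65,535 (unsigned reading) share one bit pattern,
# so both step the pointer back by one location.
print(hex(linear_update(0x1000, -1)), hex(linear_update(0x1000, 0xFFFF)))
```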


4.11.2 Reverse Carry Modifier 

The address modification is performed by propagating the carry in the reverse direction, 
i.e., from the MSB to the LSB. This is equivalent to bit-reversing the contents of Rn and 
the offset value Nn, adding normally, and then bit-reversing the result. If the (Rn)+Nn
addressing mode is used with this address modifier, and Nn contains the value 2^(k-1) (a
power of two), then postincrementing by Nn is equivalent to bit-reversing the k LSBs of
Rn, incrementing Rn by 1, and bit-reversing the k LSBs of Rn again. This address
modification is useful for 2^k point FFT addressing. The range of values for Nn is 0 to
+32,767. This allows bit-reversed addressing for FFTs up to 65,536 points. 
As an example, consider a 1024 point FFT with real data stored in one section of data 
RAM and imaginary data stored in another section of data RAM. Then Nn would contain 
the value 512 and postincrementing by +N would generate the address sequence 0, 512, 
256, 768, 128, 640, ... This is the scrambled FFT data order for sequential frequency 
points from 0 to 2π. For proper operation the reverse carry modifier restricts the base
address of the bit reversed data buffer to an integer multiple of 2^k, such as 1024, 2048,
3072, etc. The use of addressing modes other than postincrement by Nn is possible but may
not provide a useful result. 
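
The address sequence of the 1024 point example can be reproduced in Python using the equivalent description given above: bit-reverse the k LSBs of Rn, increment, and bit-reverse again. The helper names are invented for this sketch, and the buffer base is assumed to be a multiple of 2^k.

```python
def reverse_bits(value, width):
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def next_bit_reversed(rn, k):
    """Postincrement Rn by Nn = 2**(k-1) with reverse carry arithmetic."""
    mask = (1 << k) - 1
    base = rn & ~mask                      # buffer base, a multiple of 2**k
    index = (reverse_bits(rn & mask, k) + 1) & mask
    return base | reverse_bits(index, k)


def bit_reversed_sequence(start, k, count):
    """List the first `count` addresses visited, beginning at start."""
    addresses, rn = [], start
    for _ in range(count):
        addresses.append(rn)
        rn = next_bit_reversed(rn, k)
    return addresses


print(bit_reversed_sequence(0, 10, 6))     # k = 10 for a 1024 point FFT
```

Starting the same walk at base address 1024 yields the same pattern shifted by 1024, which is why the base must be aligned to 2^k.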


4.11.3 Modulo Modifier 

The address modification is performed modulo M, where M is permitted to range from 2 
to +32,768. Modulo M arithmetic causes the address register value to remain within an 
address range of size M defined by a lower and upper address boundary. The value M-1 
is stored in the modifier register Mn, thus allowing a modulo size range from 2 to 32,768. 
The lower boundary (base address) value must have zeroes in the k LSBs, where 2^k >= M, 
and therefore must be a multiple of 2^k. The upper boundary is the lower boundary plus the 
modulo size minus one (base address plus M-1). 


For example, to create a circular buffer of 24 stages, M is chosen as 24 and the lower
address boundary must have its 5 LSBs equal to zero (2^k >= 24, thus k >= 5). The Mn
register is loaded with the value 23 (M-1). The lower boundary may be chosen as 0, 32, 64,
96, 128, 160, etc. The upper boundary of the buffer is then the lower boundary plus 23. 


The address pointer is not required to start at the lower address boundary and may begin 
anywhere within the defined modulo address range. In fact, the initial location of Rn
determines the lower and upper boundaries. The upper and lower boundaries are not
explicitly needed. If the address register pointer increments past the upper boundary of
the buffer (base address plus M-1) it will wrap around to the base address. If the address
decrements past the lower boundary (base address) it will wrap around to the base address 
plus M-1. 
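
The way the boundaries follow from Rn and Mn can be sketched in Python. This is a reference model of the expected result rather than of the hardware: the function names are invented, Rn is assumed to lie inside the buffer, and the remainder operator is used in place of the single wrap the hardware performs, so the step must not exceed M in magnitude.

```python
def buffer_bounds(rn, m_reg):
    """Return (lower, upper) boundaries implied by Rn and Mn = M-1."""
    k = m_reg.bit_length()                 # smallest k with 2**k >= M
    lower = rn & ~((1 << k) - 1)           # clear the k LSBs of Rn
    return lower, lower + m_reg


def modulo_step(rn, step, m_reg):
    """Expected pointer after one update by step, with |step| <= M."""
    lower, _ = buffer_bounds(rn, m_reg)
    return lower + (rn - lower + step) % (m_reg + 1)


print(buffer_bounds(100, 23))              # 24-stage buffer containing address 100
print(modulo_step(119, 1, 23), modulo_step(96, -1, 23))
```

For the 24-stage example, a pointer at 100 lies in the buffer from 96 to 119; incrementing from 119 gives 96 and decrementing from 96 gives 119.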


If an offset Nn is used in the address calculations, the 16-bit value Nn must be less than 
or equal to M for proper modulo addressing. This is because a single modulo wrap around 
is detected. If Nn is greater than M, the result is data dependent and unpredictable except 
for the special case where Nn=L*(2^k), a multiple of the block size, 2^k, where L is a
positive integer. Note that the offset Nn must be a positive two’s complement integer. For
this case the pointer Rn will be incremented using linear arithmetic to the same relative
address L blocks forward in memory. For the normal case where Nn is less than or equal to
M, the modulo arithmetic unit will automatically wrap the address pointer around by the
required amount. This type of address modification is useful in creating circular buffers
for FIFOs 
Table 4-1 DSP56100 Family Addressing Modes 

Addressing Mode                  Uses Mn Modifier   Operand Reference (S C D A P X XX)
Register Direct 
  Data or Control Register       No                 X X X 
  Address Register Rn            No                 X 
  Address Modifier Register Mn   No                 X 
  Address Offset Register Nn     No                 X 
Address Register Indirect 
  No Update                      No                 X X 
  Postincrement by 1             Yes*               X X X 
  Postdecrement by 1             Yes                X X 
  Postincrement by Offset Nn     Yes*               X XX 
  Indexed by Offset Nn           Yes                X 
  Predecrement by 1              Yes                X 
PC Relative 
  Long Displacement              No                 X 
  Short Displacement             No                 X X 
  Address Register               No                 X X 
Special 
  Upper word of accumulator      No                 X 
  Immediate Data                 No                 X 
  Immediate Short Data           No                 X 
  Absolute Address               No                 X X 
  Absolute Short Address         No                 X X 
  Short Jump Address             No                 X 
  I/O Short Address              No                 X 
  Implicit                       No                 X X X 
  Indexed by short displacement  No                 X 
Where: 
S = System Stack Reference 
C = Program Controller Register Reference 
D = Data ALU Register Reference 
A = Address ALU Register Reference 
P = Program Memory Reference 
X = X Memory Reference 
XX = Double X Memory Read 

*note: M3 is not used for updating R3 in the second read in the X memory 
(queues), delay lines, and sample buffers up to 32,768 words long. It is also used for
decimation, interpolation, and waveform generation. The special case of (Rn)+Nn with 
Nn=L*(2^k) is useful for performing the same algorithm on multiple buffers, for example
implementing a bank of parallel filters. The range of values for Nn is -32,768 to +32,767
although all values are not useful when modulo addressing as described above. 


4.11.4 Wrap-Around Modulo Modifier 
The address modification is performed modulo M, where M may be any power of 2 in the 


range from 2' to 2'®. Modulo M arithmetic causes the address register value to remain 
within an address range of size M defined by a lower and upper address boundary. The 
lower boundary (base address) value must have zeroes in the k LSBs, where 2^k = M, and 
therefore must be a multiple of 2^k. The upper boundary is the lower boundary plus the 
modulo size minus one (base address plus M-1). 


For example, to create a circular buffer of 32 stages, M is chosen as 32 and the lower
address boundary must have its 5 LSBs equal to zero (2^k = 32, thus k = 5). The Mn register 
is loaded with the value $001F. The lower boundary may be chosen as 0, 32, 64, 96, 128, 
160, etc. The upper boundary of the buffer is then the lower boundary plus 31. 


The address pointer is not required to start at the lower address boundary and may begin 
anywhere within the defined modulo address range (between the lower and upper bound- 
aries). If the address register pointer increments past the upper boundary of the buffer 
(base address plus M-1) it will wrap around to the base address. If the address decre- 
ments past the lower boundary (base address) it will wrap around to the base address 
plus M-1. If an offset Nn is used in the address calculations, the 16-bit value Nn is required 
to be less than or equal to M for proper modulo addressing since multiple wrap around is 
not supported. The range of values for Nn is -32,768 to +32,767. 
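
The wrap-around behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the AGU implementation: the function name modulo_update is hypothetical, M is assumed to be a power of 2, and the restriction on Nn is assumed to apply to its magnitude.

```python
def modulo_update(r, n, m):
    """Model of (Rn)+Nn with a power-of-2 modulo M (hypothetical helper).

    r: current 16-bit address, n: signed offset, m: buffer size M.
    """
    if m <= 0 or m & (m - 1):
        raise ValueError("M must be a power of 2")
    if abs(n) > m:
        # Assumption: the offset magnitude may not exceed M (single wrap only).
        raise ValueError("|Nn| must be less than or equal to M")
    base = r & ~(m - 1)                 # lower boundary: k LSBs are zero
    return base | ((r + n) & (m - 1))   # wrap inside base .. base + M-1


# 32-stage buffer with lower boundary 64 and upper boundary 95
print(modulo_update(95, 1, 32))    # incrementing past the upper boundary
print(modulo_update(64, -1, 32))   # decrementing past the lower boundary
```

With the 32-stage buffer from the example, incrementing from address 95 returns to the base address 64, and decrementing from 64 returns to 95.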


This type of address modification is useful for decimation, interpolation, and waveform 
generation since the multiple wrap-around capability may be used for argument reduction. 
4.11.5 Address Modifier Type Encoding Summary
Table 4-2 contains a summary of the address modifier types discussed in the previous
paragraphs.


Table 4-2 Addressing Mode Modifier Summary


16-bit Modifier Reg. (M0-M3)
MMMMMMMMMMMMMMMM      Address Calculation Arithmetic

0000000000000000      Reverse Carry (Bit Reversed)
0000000000000001      Modulo 2
0000000000000010      Modulo 3
       :                   :
0111111111111110      Modulo 32767
0111111111111111      Modulo 32768
1000000000000000      Reserved
       :                   :
1111111111111110      Reserved
1111111111111111      Linear (Modulo 65536)

where MMMMMMMMMMMMMMMM = 16-bit Modifier Reg. Contents
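
As a quick cross-check of Table 4-2, the encoding can be expressed as a small Python lookup. The function name decode_modifier and the returned strings are illustrative assumptions; only the value ranges come from the table.

```python
def decode_modifier(mn):
    """Classify a 16-bit modifier register value per Table 4-2 (illustrative)."""
    if not 0 <= mn <= 0xFFFF:
        raise ValueError("modifier register holds 16 bits")
    if mn == 0x0000:
        return "reverse carry"
    if mn <= 0x7FFF:
        return f"modulo {mn + 1}"   # modulo size is the register value plus one
    if mn == 0xFFFF:
        return "linear"
    return "reserved"               # $8000 through $FFFE


print(decode_modifier(0x001F))      # the 32-stage circular buffer case
```

Loading Mn with $001F, as in the 32-stage buffer example, therefore selects modulo 32 arithmetic.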
5.1. INTRODUCTION 

The PCU performs program address generation (instruction prefetch), instruction decoding,
hardware DO-loop control, and exception processing. The programmer views the
PCU as consisting of six registers and a hardware system stack (SS) as shown in
Figure 5-1. In addition to the standard program flow-control resources, such as a program
counter (PC), complete status register (SR), and SS, the PCU features registers (loop
address LA and loop counter LC) dedicated to supporting the hardware DO loop
instruction.


[Figure: PCU programming model showing the 16-bit PC, the 16-bit SR (MR | CCR), the
16-bit OMR, the 16-bit LA and LC, the 6-bit SP, and the system stack with its 16-bit
SSH and SSL banks.]


Figure 5-1 Program Control Unit Block Diagram 


5.2 PROGRAM COUNTER (PC) 

This 16-bit register contains the address of the next location to be fetched from Program 
Memory Space. The PC may point to instructions, data operands or addresses of oper- 
ands. References to this register are always inherent and are implied by most instruc- 
tions. This special purpose address register is stacked when program looping is initiated, 
when a branch or a jump to subroutine is performed, and when interrupts occur except 
for fast interrupts (refer to Section 7.3.4.1). 
5.3. STATUS REGISTER (SR) 

The status register is a 16-bit register consisting of an 8-bit Mode register (MR) and an 8- 
bit Condition Code register (CCR). The MR register is the high-order 8 bits of the status 
register; the CCR register is the low-order 8 bits. 


The MR bits are only affected by processor reset, exception processing, the DO, 
ENDDO, RTI, and SWI instructions and by instructions which directly reference the MR 
register (e.g., ANDI, ORI). During processor reset, the interrupt mask bits of the 
mode register will be set, the scaling mode bits, loop flag, sticky bit, and the for- 
ever flag will be cleared. The CCR is a special purpose control register which defines 
the current user state of the processor at any given time. The CCR bits are affected by 
data ALU operations, one address ALU operation (CHKAAU), bit field manipulation 
instructions, parallel move operations, and by instructions which directly reference the 
CCR register. The CCR bits are not affected by data transfers over XDB except if data 
limiting occurs when reading the A or B accumulators. During processor reset, all CCR 
bits are cleared. The standard definition of the CCR bits is given below. Refer to Appendix A,
Section A.3 for the complete CCR bit computation rules. The SR register is
stacked when program looping is initialized, when a jump or branch to subroutine (JSR,
BSR) is performed, and when interrupts occur, except for fast interrupts (refer to Section
7.3.4.1). The status register format is shown in Figure 5-2 and is described below.


5.3.1 Carry (Bit 0) 

The carry (C) bit is set if a carry is generated out of the most significant bit of the result
for an addition, or if a borrow is generated in a subtraction. The carry or borrow is
generated out of bit 39 of the result. The carry bit is also modified by bit manipulation,
rotate, and shift instructions. Otherwise, this bit is cleared. This bit is cleared on
hardware reset.


5.3.2 Overflow (Bit 1) 

The overflow (V) bit is set if an arithmetic overflow occurs in the result. This indicates that 
the result is not representable in the accumulator register and the accumulator register 
has overflowed. Otherwise, this bit is cleared. 


5.3.3 Zero (Bit 2) 
The zero (Z) bit is set if the result equals zero. Otherwise, this bit is cleared. 


5.3.4 Negative (Bit 3) 
The negative (N) bit is set if the most significant bit (bit 39) of the result is set.
Otherwise, this bit is cleared.




|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF  FV   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

 C        Carry
 V        Overflow
 Z        Zero
 N        Negative
 U        Unnormalized
 E        Extension
 L        Limit
 S        Sticky Bit
 I1,I0    Interrupt Mask
 S1,S0    Scaling Mode
 *        Reserved
 FV       ForeVer Flag
 LF       Loop Flag


Figure 5-2 Status Register Format 


5.3.5 Unnormalized (Bit 4) 

The unnormalized (U) bit is set if the two most significant bits of the MSP portion of the
result are the same; it is cleared otherwise. The MSP portion is defined by the scaling mode,
and the U bit is computed as follows:


Scaling Mode     U Bit Computation

No scaling       U = NOT (Bit 31 xor Bit 30)
Scale down       U = NOT (Bit 32 xor Bit 31)
Scale up         U = NOT (Bit 30 xor Bit 29)


The result of calculating the U bit in this fashion is that the definition of a positive
normalized number, p, is 0.5 <= p < 1.0 and the definition of a negative normalized number, n, is
-1.0 <= n < -0.5.


5.3.6 Extension (Bit 5) 

The extension (E) bit is cleared if all the bits of the integer portion of the 40-bit result
are the same; that is, the bit pattern is 00...00 or 11...11. It is set otherwise. The integer
portion is defined by the scaling mode and the E bit is computed as follows:


Scaling Mode     Integer Portion

No scaling       Bits 39,38,...,32,31
Scale down       Bits 39,38,...,33,32
Scale up         Bits 39,38,...,31,30


If E is cleared, then the low-order fraction portion contains all the significant bits - the 
high order integer portion is just sign extension. In this case, the accumulator extension 
register can be ignored. If E is set, it indicates that the extension accumulator is in use. 
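
The U and E bit rules can be checked with a short Python sketch. It follows the prose definitions (U set when the two most significant MSP bits are equal, E cleared when the integer portion is all zeros or all ones). The names TOP_BIT, u_bit, and e_bit are hypothetical, and the accumulator is passed as an unsigned 40-bit integer.

```python
# Highest MSP bit for each scaling mode; it is also the lowest integer-portion bit.
TOP_BIT = {"none": 31, "down": 32, "up": 30}


def u_bit(acc, mode="none"):
    """Return 1 if the two most significant MSP bits are the same."""
    top = TOP_BIT[mode]
    return int(((acc >> top) & 1) == ((acc >> (top - 1)) & 1))


def e_bit(acc, mode="none"):
    """Return 0 if the integer portion (bits 39..top) is only sign extension."""
    low = TOP_BIT[mode]
    width = 40 - low
    field = (acc >> low) & ((1 << width) - 1)
    return int(field not in (0, (1 << width) - 1))


half = 0x0040000000      # +0.5 with no scaling: bits 31,30 = 0,1
quarter = 0x0020000000   # +0.25 with no scaling: bits 31,30 = 0,0
print(u_bit(half), u_bit(quarter), e_bit(0x0100000000))
```

In this model +0.5 is normalized (U = 0), +0.25 is unnormalized (U = 1), and a value with bit 32 set uses the extension register (E = 1).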


5.3.7 Limit (Bit 6) 

The limit (L) bit is set if the overflow bit V is set or if the data shifter/limiters perform a lim- 
iting operation. The limit bit is also set by the saturation of the 32-bit result when the sat- 
uration bit of the operating mode register is set. Not affected otherwise. The L bit is 
cleared only by a processor reset or an instruction which specifically clears it. This allows 
the L bit to be used as a latching overflow bit. Note that L is affected by data movement 
operations which read the A or B accumulator registers onto the XDB or GDB. 


5.3.8 Sticky Bit (Bit 7) 
The Sticky (S) bit is set only on moves of the form F, X:<> (move from accumulator to 
data memory) under the following conditions: 


if no scaling       set S = bit 30 XOR bit 29
if scaling down     set S = bit 31 XOR bit 30
if scaling up       set S = bit 29 XOR bit 28


This test is performed on two bits of the source accumulator. 
This bit is a sticky bit in the sense that once set, it can only be reset by a MOVE to the 
status register SR or an ANDI #xx,SR. This bit is especially useful for attaining maximum 
accuracy on input data of a block floating point FFT (see Application note APR4/D, 
Implementation of Fast Fourier Transforms on Motorola’s Digital Signal Processors). 


5.3.9 Interrupt Masks (Bits 8,9) 

The interrupt mask bits I1 and I0 reflect the current priority level of the processor and
indicate the interrupt priority level (IPL) needed for an interrupt source to interrupt the
processor. The current priority level of the processor may be changed under software
control. The interrupt mask bits are set during processor reset.


I1  I0    Exceptions Accepted    Exceptions Masked

0   0     IPL 0,1,2,3            None
0   1     IPL 1,2,3              IPL 0
1   0     IPL 2,3                IPL 0,1
1   1     IPL 3                  IPL 0,1,2


5.3.10 Scaling Mode (Bits 10,11) 

The scaling mode bits S1 and S0 specify the scaling to be performed in the Data ALU
shifter/limiter and the rounding position in the Data ALU multiply-accumulator (MAC).
The scaling modes are shown below.


S1  S0    Rounding Bit    Scaling Mode

0   0     15              No Scaling
0   1     16              Scaling Down
1   0     14              Scaling Up
1   1     -               Reserved


The shifter/limiter scaling mode affects data read from the A or B accumulator registers 
out to the XDB. Different scaling modes may be used with the same program code to 
allow dynamic scaling. This allows block floating point arithmetic to be performed. The 
scaling mode also affects the MAC rounding position to maintain proper rounding when 
different portions of the accumulator registers are read out to the XDB. This provides 
consistent rounding in block floating point arithmetic. The scaling mode bits are cleared 
at the start of a long interrupt service routine. The scaling mode bits are also cleared dur- 
ing a processor reset. 
5.3.11 Reserved Status (Bits 12,13) 
These bits are reserved for future expansion and will read as zero during DSP read
operations. They should be written with zero for future compatibility.


5.3.12 ForeVer Flag (Bit 14) 

The ForeVer flag (FV) bit is set when a DO FOREVER program loop is in progress and 
enables the detection of the end of a program loop. The FV flag, like the loop flag, is 
restored when terminating a DO FOREVER program loop. Stacking and restoring the FV 
flag when initiating and exiting a DO FOREVER program loop, respectively, allows the 
nesting of program loops. The FV flag is cleared at the start of a long interrupt service 
routine. The FV flag is also cleared during a processor reset. 


5.3.13 Loop Flag (Bit 15) 

The loop flag (LF) bit is set when a program loop is in progress and enables the detection 
of the end of a program loop. LF and FV are the only status register bits which are 
restored when terminating a program loop. Stacking and restoring the loop flag when
initiating and exiting a program loop, respectively, allows the nesting of program loops. The
loop flag is cleared at the start of a long interrupt service routine. The loop flag is also 
cleared during a processor reset. 
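
The bit assignments given in Sections 5.3.1 through 5.3.13 can be collected into a small Python decoder. This is a reading aid only: the names SR_BITS and decode_sr are hypothetical, and the reset value $0300 is derived from the reset behavior stated above (interrupt mask bits set, all other bits cleared).

```python
# Bit position of each status register flag (bits 12 and 13 are reserved).
SR_BITS = {
    "C": 0, "V": 1, "Z": 2, "N": 3, "U": 4, "E": 5, "L": 6, "S": 7,   # CCR
    "I0": 8, "I1": 9, "S0": 10, "S1": 11, "FV": 14, "LF": 15,         # MR
}


def decode_sr(sr):
    """Split a 16-bit SR value into named single-bit flags."""
    return {name: (sr >> bit) & 1 for name, bit in SR_BITS.items()}


reset_state = decode_sr(0x0300)   # I1 = I0 = 1, everything else cleared
print(reset_state["I1"], reset_state["I0"], reset_state["LF"])
```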


5.4 LOOP COUNTER (LC) 

The loop counter is a special 16-bit counter used to specify the number of times to repeat 
a hardware program loop. This register is stacked by a DO instruction and unstacked by 
end of loop processing or by execution of a BRKcc or an ENDDO instruction. When the 
end of a hardware program loop is reached, the contents of the loop counter register are 
tested for one. If the loop counter is one, the program loop is terminated and the LC reg- 
ister is loaded with the previous LC contents stored on the stack. If the loop counter is 
not one, it is decremented by one and the program loop is repeated. The loop counter 
may be read under program control. This allows the number of times a loop has been 
executed to be determined during execution. Note that if LC=0 during execution of the 
DO instruction, the loop will not be executed and the program will continue with the 
instruction immediately after the loop end of expression. LC is also used in the REP 
instruction. 
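
The end-of-loop test on LC can be modeled as follows. This Python sketch only mirrors the counting rule described above (test LC for one, otherwise decrement and repeat; a count of zero skips the loop). The function do_loop and its callback argument are hypothetical, and stacking of LA, LC, and SR is not modeled.

```python
def do_loop(count, body):
    """Run body the way a hardware DO loop would, returning the pass count."""
    if count == 0:
        return 0                 # LC = 0 at the DO instruction: loop is skipped
    lc = count
    passes = 0
    while True:
        body(lc)                 # LC can be read inside the loop
        passes += 1
        if lc == 1:              # end of loop: LC is tested for one
            break
        lc -= 1                  # otherwise decrement and repeat
    return passes


seen = []
do_loop(3, seen.append)
print(seen)                      # LC values observed on each pass
```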


5.5 LOOP ADDRESS REGISTER (LA) 

The loop address register indicates the location of the last instruction word in a program 
loop. This register is stacked by a DO instruction and unstacked by end of loop process- 
ing or by execution of an ENDDO instruction. When the instruction word at the address 
contained in this register is fetched, the content of LC is checked. If it is not one, the LC 
is decremented, and the next instruction is taken from the address at the top of the sys- 
tem stack; otherwise the PC is incremented, the loop flag is restored (pulled from stack), 
the stack is purged, the LA and LC registers are pulled from the stack and restored, and 
instruction execution continues normally. The LA register is a read/write register written 
into by a DO instruction and is read by the system stack for stacking the register. The LA 
register can be directly accessed by some instructions. 


5.6 SYSTEM STACK (SS) 

The system stack is a separate internal RAM, 15 locations “deep”, and divided into two 
banks: High (SSH) and Low (SSL) each 16 bits wide. SSH stores the PC or LA contents; 
SSL stores the LC or SR contents. 

The PC and SR registers are pushed on the stack for subroutine calls and long inter- 
rupts. These registers are pulled from the stack for subroutine returns using the RTS 
instruction and for interrupt returns that use the RTI instruction. The system stack is also 
used for storing the address of the beginning instruction of a hardware program loop as 
well as the SR, LA, and LC register contents just prior to the start of the loop. This allows 
nesting of DO loops. 


Up to 15 long interrupts, 7 DO loops, or 15 JSRs or combinations of these can be accom- 
modated by the Stack. Care must be taken when approaching the stack limit. When the 
Stack limit is exceeded the data to be stacked will be lost and a non-maskable Stack 
Error interrupt will occur. The stack error interrupt occurs after the stack limits have been 
exceeded. 


5.7 STACK POINTER (SP) 

The stack pointer register (SP) is a 6-bit register that indicates the location of the top of 
the system stack and the status of the stack (underflow, empty, full, and overflow condi- 
tions). The stack pointer is referenced implicitly by some instructions (DO, REP, JSR, 
RTI, etc.) or directly by the MOVEC instruction. The stack pointer register format is 
shown in Figure 5-3 and is described below. Note that the stack pointer register is
implemented as a 6-bit counter which addresses (selects) a fifteen location stack with its four
least significant bits. The possible stack values are shown in Table 5-1 and are
described below.
  5    4    3    2    1    0
  UF   SE   P3   P2   P1   P0

  P3-P0    Stack Pointer
  SE       Stack Error Flag
  UF       Underflow Flag


Figure 5-3 SP Register Format 


Table 5-1 Stack Pointer Values 

UF  SE  P3  P2  P1  P0    CAUSE

1   1   1   1   1   0     Stack underflow condition after double pull.
1   1   1   1   1   1     Stack underflow condition.
0   0   0   0   0   0     Stack empty (reset). Pull causes underflow.
0   0   0   0   0   1     Stack location 1.
            :
0   0   1   1   1   0     Stack location 14.
0   0   1   1   1   1     Stack location 15 (stack full). Push causes overflow.
0   1   0   0   0   0     Stack overflow condition.
0   1   0   0   0   1     Stack overflow condition after double push.


5.7.1 Stack Pointer (Bits 0,1,2,3) 
The stack pointer (SP) points to the last used place on the stack. Immediately after hard- 
ware reset these bits are cleared (SP=0), indicating that the stack is empty. 


Data is pushed onto the stack by incrementing SP by one then writing the item at stack 
location SP. An item is pulled off the stack by copying it from location SP and then decre- 
menting SP by one. 
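
The pre-increment push and post-decrement pull can be illustrated with a minimal Python model. The class SystemStackModel is hypothetical; it keeps only the four pointer bits, stores (SSH, SSL) pairs in locations 1 through 15, and raises a Python exception as a stand-in for the stack error exception rather than modeling the SE and UF flags.

```python
class SystemStackModel:
    """Toy model of the 15-location system stack (error flags not modeled)."""

    def __init__(self):
        self.sp = 0                      # SP = 0 after reset: stack empty
        self.slots = [None] * 16         # locations 1..15 are used

    def push(self, ssh, ssl):
        if self.sp == 15:
            raise OverflowError("stack full: push would cause a stack error")
        self.sp += 1                     # increment SP first ...
        self.slots[self.sp] = (ssh, ssl) # ... then write at location SP

    def pull(self):
        if self.sp == 0:
            raise IndexError("stack empty: pull would cause a stack error")
        item = self.slots[self.sp]       # copy from location SP first ...
        self.sp -= 1                     # ... then decrement SP
        return item


stack = SystemStackModel()
stack.push(0x0100, 0x0300)               # e.g. PC and SR for a subroutine call
stack.push(0x0200, 0x0300)
print(stack.pull(), stack.sp)
```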


5.7.2 Stack Error Flag - SE (Bit 4) 
The Stack Error flag (SE) indicates that a stack error has occurred and the transition of 
SE from 0 to 1 causes the priority level 3 stack error exception (see Chapter 14). 
When the stack is completely full, the stack pointer reads 001111, and any operation
that pushes data to the stack will cause a stack error exception to occur and the stack
pointer register will read 010000 (or 010001 if an implied double push occurs).


Any implied pull operation with SP=0 will cause a stack error exception (see Chapter
14), and the SP will read all ones (or 111110 if an implied double pull occurs). As shown
in Table 5-1, the SE bit is set.


Note: When SP=0 (stack empty), instructions which read the stack without SP
post-decrement and instructions which write the stack without SP pre-increment do not
cause a stack error exception (e.g., DO SSL,xxxx; REP SSL; MOVEC or MOVEP when SSL
is specified as a source or destination).


5.7.3 Underflow Flag - UF (Bit 5) 
The Underflow flag (UF) is set when a stack underflow occurs. See Table 5-1.


When the user explicitly writes the SP register with the UF set and the SE cleared, and
follows this operation with an implicit stack operation that increments/decrements the
stack pointer, the Underflow flag will be cleared by the implicit operation, as long as the
SE was not set. If the Stack Error flag was set, the Underflow flag will not change state (the
"sticky" effect). In this way, when a stack error does occur, the reason for the error,
underflow or overflow, is preserved. Some examples are given below as illustrations:


Example 1: 
move #$20,sp ; set underflow flag, clear stack error flag 
move anything,ssh ; implicit SP increment 
move sp,x:out ; read SP, it should be $01 


In this example, the implicit SP increment cleared the Underflow flag because the Stack 
Error flag was cleared. 


Example 2: 
move #$30,sp ; set underflow flag, set stack error flag 
move anything,ssh ; implicit SP increment 
move sp,x:out ; read SP, it should be $31 


In this example, the implicit SP increment did not clear the UF because SE was set. 


Example 3: 
move #$2F,sp ; set underflow flag, clear stack error flag 
move anything,ssh ; implicit SP increment 
move sp,x:out ; read SP, it should be $10 
In this example, the implicit SP increment produced a stack overflow error, setting Stack
Error and clearing the Underflow flag (to show an overflow error).


While the Stack Error flag is set, implicit SP increments/decrements will not affect the
Underflow or Stack Error flags in any way (this is the "sticky" effect) even if decrementing
when the 4 LSBs of SP are '0' or incrementing when the 4 LSBs of SP are '1'.


Example 4: 
move #$10,sp ; clear underflow flag, set stack error flag 
move ssh,destin. ; implicit SP decrement 
move sp,x:out ; read SP, it should be $1F 


In this example, the implicit SP decrement did not set the Underflow flag to denote 
underflow because the Stack Error flag was set. 


Example 5: 
move #$3F,sp ; set underflow flag, set stack error flag 
move anything,ssh ; implicit SP increment 
move sp,x:out ; read SP, it should be $30 


In this example, the implicit SP increment did not clear the Underflow flag to denote over- 
flow because the Stack Error flag was set. 
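
The five examples can be reproduced with a small Python model of an implicit SP update. The rule set below is inferred from the examples and from the stack pointer values given earlier, so it should be read as a sketch of the described behavior rather than the actual hardware logic; the function name sp_step is hypothetical.

```python
def sp_step(sp, delta):
    """Apply one implicit SP increment (delta=+1) or decrement (delta=-1).

    sp is the 6-bit register value: UF in bit 5, SE in bit 4, pointer in bits 3-0.
    """
    ptr = sp & 0x0F
    se = (sp >> 4) & 1
    uf = (sp >> 5) & 1
    new_ptr = (ptr + delta) & 0x0F       # the four pointer bits always wrap
    if not se:                           # flags only change while SE is clear
        if delta > 0 and ptr == 0x0F:
            se, uf = 1, 0                # overflow: set SE, clear UF
        elif delta < 0 and ptr == 0x00:
            se, uf = 1, 1                # underflow: set SE and UF
        else:
            uf = 0                       # ordinary step clears a stale UF
    return (uf << 5) | (se << 4) | new_ptr


for start, delta in [(0x20, 1), (0x30, 1), (0x2F, 1), (0x10, -1), (0x3F, 1)]:
    print(f"${start:02X} -> ${sp_step(start, delta):02X}")
```

Run over the starting values of Examples 1 through 5, the model yields $01, $31, $10, $1F, and $30, matching the values the examples expect to read back.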


5.7.4 Unimplemented Stack Pointer Register bits 
Any unimplemented stack pointer register bits are reserved for future expansion and will 
read as zero during DSP read operations. 


5.8 OPERATING MODE REGISTER (OMR) 

The operating mode register (OMR) is a 16-bit register which defines the current chip 
operating mode of the processor. The OMR bits are only affected by processor reset and 
by instructions which directly reference the OMR. 


During processor reset the chip operating mode bits will be loaded from the external 
Mode Select pins. The operating mode register format is shown in Figure 5-4 and is 
described below. 


Note: When a bit of the OMR is changed by an instruction, a delay of one instruction cycle 
is necessary before the new mode comes into effect. 
 15 ... 8    7    6    5    4    3    2    1    0
    *        CD   SD   R    SA   *    MC   MB   MA

 MA, MB   Operating Mode
 MC       Bus Arbitration Mode
 SA       Saturation
 R        Rounding
 SD       Stop Delay
 CD       Clockout Disable
 *        Reserved


Figure 5-4 Operating Mode Register Format 


5.8.1 Operating Mode Bits (Bits 0,1) 

The chip operating mode bits MB and MA indicate the bus expansion mode of the DSP 
when an external bus extension exists. These bits are loaded from the external Mode 
Select pins MODB and MODA respectively on processor reset. After the DSP leaves the 
RESET state, MB and MA may be changed under program control. The Operating 
Modes are shown below: 


Chip Operating Mode      Comments

Special Bootstrap 1      Bootstrap from an external byte-wide memory located at P:$C000.
Special Bootstrap 2      Bootstrap from the Host port or SSI0.
Normal Expanded          Internal PRAM enabled; External reset at P:$E000.
Development Expanded     Int. program memory disabled; Ext. reset at P:$000.
5.8.2 Bus Arbitration Mode Bit (Bit 2) 

The bus operating mode bit MC indicates the bus arbitration mode of the DSP when an 
external bus extension exists. This bit is loaded from the external Mode Select pin 
MODC on processor reset. After the DSP leaves the RESET state, MC may be changed 
under program control. The Bus Operating Modes are shown below and more details are 
given in Section 7 and Section 15. 


Bus Arbitration Mode 


Slave 
Master 


5.8.3 Saturation Bit (Bit 4) 

The Saturation bit (SA), when set, selects automatic saturation on 32 bits for the results 
going to the accumulator. This saturation is done by a special saturation circuit inside the 
MAC unit. The purpose of this bit is to provide a saturation mode for 16-bit algorithms 
which do not recognize or cannot take advantage of the extension accumulator. 


The saturation logic operates by checking three bits of the 40-bit result: two bits of the 
extension byte (exp[7] and exp[0]) and one bit on the MSP (msp[15]). The result 
obtained in the accumulator when SA =1 is shown in Table 5-2: 


Table 5-2 Actions of the Saturation Mode (SA=1) 

exp[7]   exp[0]   msp[15]    result in accumulator

  0        0        0        unchanged
  0        0        1        $00 7FFF FFFF
  0        1        0        $00 7FFF FFFF
  0        1        1        $00 7FFF FFFF
  1        0        0        $FF 8000 0000
  1        0        1        $FF 8000 0000
  1        1        0        $FF 8000 0000
  1        1        1        unchanged


This bit is cleared by processor reset. 


The scaling bits are ignored by this saturation logic and the two saturation constants
$007FFFFFFF and $FF80000000 are not affected by the scaling mode. In the same
way, the rounding of the saturation constant (during MPYR, MACR, RND) is independent
of the scaling mode: $007FFFFFFF is rounded to $007FFF0000 and $FF80000000 to
$FF80000000.
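
The three-bit check performed by the saturation logic can be sketched in Python. The sketch assumes that the result is left unchanged only when exp[7], exp[0], and msp[15] all agree, and that exp[7] selects which saturation constant is used; the function name saturate40 is hypothetical, and the accumulator is passed as an unsigned 40-bit integer.

```python
POS_MAX = 0x007FFFFFFF   # positive saturation constant
NEG_MAX = 0xFF80000000   # negative saturation constant


def saturate40(acc):
    """Model of the SA=1 saturation check on a 40-bit result."""
    acc &= (1 << 40) - 1
    exp7 = (acc >> 39) & 1    # sign bit of the extension byte
    exp0 = (acc >> 32) & 1    # least significant extension bit
    msp15 = (acc >> 31) & 1   # most significant bit of the MSP
    if exp7 == exp0 == msp15:
        return acc            # the three bits agree: leave the result unchanged
    return NEG_MAX if exp7 else POS_MAX


print(f"${saturate40(0x0080000000):010X}")   # just past the positive limit
print(f"${saturate40(0x0012345678):010X}")   # within range
```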




CAUTION 


The saturation mode is ALWAYS disabled during the execution of the following
instructions: DMACsu, DMACuu, MACsu, MACuu, MPYsu, MPYuu,
and ASL4. The instruction ASL4 A (or B) can be followed by a MOVE A,A
(or B,B) for proper operation when the saturation mode is turned on. However,
the "V" bit of the status register will never be set by the saturation of
the accumulator during the MOVE A,A (or B,B). Only the "L" bit will then be
set. If the "V" bit needs to be tested by the program, ASL4 has to be replaced
by a repetition of four ASLs.


5.8.4 Rounding Bit (Bit 5) 
The Rounding bit (R) selects between convergent rounding and two's-complement
rounding. When set, two's-complement rounding (always round up) is used.


This bit is cleared by processor reset. 
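
The difference between the two rounding modes only shows up when the discarded bits are exactly half of one LSB. The Python sketch below assumes no scaling (rounding bit 15), so the 16 LSBs are rounded away, and models convergent rounding as round-half-to-even. The function name round_msp is hypothetical, and values are plain Python integers rather than 40-bit registers.

```python
def round_msp(acc, convergent=True):
    """Round acc to a multiple of 2**16 (no scaling assumed)."""
    low = acc & 0xFFFF
    rounded = (acc + 0x8000) & ~0xFFFF   # two's-complement rounding: ties go up
    if convergent and low == 0x8000:
        rounded &= ~0x10000              # tie: force the result LSB to zero (even)
    return rounded


tie = 0x00028000                         # exactly halfway between 2 and 3
print(hex(round_msp(tie, convergent=False)), hex(round_msp(tie, convergent=True)))
```

For the halfway value shown, two's-complement rounding goes up to 3 in the upper word while convergent rounding settles on the even value 2; for any non-tie input the two modes agree.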


5.8.5 Stop Delay Bit (Bit 6) 
The Stop Delay bit (SD) is used to select the delay that the DSP needs to exit the STOP 
mode. Refer to Section 7.5 for more details. 


This bit is cleared by processor reset. 


5.8.6 Clock Out Disable Bit (Bit 7) 

When the Clock Out Disable bit (CD) is cleared in the OMR, a clock out signal comes out
of the CLKO pin. Setting the CD bit will disable the signal coming out of the CLKO pin
one instruction cycle after the bit has been set. This bit can be set by the user program
when radiation sensitive applications do not need the clock out signal.


This bit is cleared by processor reset.


5.8.7 Reserved Operating Mode Register Bits (Bits 3 and 8-15) 


These operating mode register bits are reserved. They will read as zero during DSP read 
operations and should be written as zero to ensure future compatibility. 
SECTION 6 


INSTRUCTION SET AND EXECUTION 


Fetch       F1   F2   F3   F3e  F4   F5   F6
Decode           D1   D2   D3   D3e  D4   D5
Execute               E1   E2   E3   E3e  E4

(columns correspond to successive instruction cycles)
6.1 INTRODUCTION 

As indicated by the programming model in Chapter 5, the DSP architecture can be 
viewed as three functional units operating in parallel (Data ALU, AGU and PCU). The 
goal of the instruction set is to keep each of these units busy each instruction cycle. This 
achieves maximum speed and minimum use of program memory. 


This section introduces the DSP instruction set and instruction format. The complete 
range of instruction capabilities combined with the flexible addressing modes provide a 
very powerful assembly language for digital signal processing algorithms. The instruction 
set has also been designed to allow efficient coding for future high-level DSP language 
compilers. Execution time is enhanced by the hardware looping capabilities. 


6.2 INSTRUCTION GROUPS 
The instruction set is divided into the following groups: 


• Arithmetic
• Logical
• Bit Field Manipulation
• Loop
• Move
• Program Control


Each instruction group is described in the following sections. Detailed information on
each instruction is given in Appendix A.


6.2.1 Arithmetic Instructions 

The arithmetic instructions perform all of the arithmetic operations within the Data ALU. 
They may affect all of the condition code register bits. Arithmetic instructions are register- 
based (register direct addressing modes used for operands) so that the Data ALU opera- 
tion indicated by the instruction does not use the XDB or the GDB. Optional data trans- 
fers may be specified with most arithmetic instructions. This allows for parallel data 
movement over the XDB and over the GDB during a Data ALU operation. This allows 
new data to be prefetched for use in following instructions and results calculated by pre- 
vious instructions to be stored. These instructions execute in one instruction cycle. The 
following are the arithmetic instructions. 


ABS           Absolute Value
ADC           Add Long with Carry
ADD           Add
ASL           Arithmetic Shift Left
ASL4          4 Bit Arithmetic Shift Left*
ASR           Arithmetic Shift Right
ASR4          4 Bit Arithmetic Shift Right*
ASR16         16 Bit Arithmetic Shift Right*
CLR           Clear an Accumulator
CLR24         Clear 24 MSBs of an Accumulator
CMP           Compare
CMPM          Compare Magnitude
DEC           Decrement Accumulator
DEC24         Decrement upper word of Accumulator
DIV           Divide Iteration*
DMAC          Double (Multi) precision oriented MAC*
EXT           Sign Extend Accumulator from bit 31*
IMAC          Integer Multiply-Accumulate*
IMPY          Integer Multiply*
INC           Increment Accumulator
INC24         Increment 24 MSBs of Accumulator
MAC           Signed Multiply-Accumulate
MACR          Signed Multiply-Accumulate and Round
MPY           Signed Multiply
MPYR          Signed Multiply and Round
MPY(su,uu)    Mixed mode Multiply*
MAC(su,uu)    Mixed mode Multiply-Accumulate*
NEG           Negate
NEGC          Negate with Borrow*
NORM          Normalize*
RND           Round
SBC           Subtract Long with Carry
SUB           Subtract
SUBL          Shift Left and Subtract
SWAP          Swap MSP and LSP of an Accumulator*
Tcc           Transfer Conditionally*
TFR           Transfer Data ALU Register (Accumulator as destination)
TFR2          Transfer Accumulator (32 bit Data ALU register as destination)*
TST           Test an accumulator
TST2          Test an ALU data register*
ZERO          Zero Extend Accumulator from bit 31*


*These instructions do not allow parallel data moves.


6.2.2 Logical Instructions 

The logical instructions perform all of the logical operations within the Data ALU. They 
may affect all of the condition code register bits. Logical instructions are register-based 
as are the arithmetic instructions above. Optional data transfers may be specified with 
most logical instructions. This allows for parallel data movement over the XDB and over 
the GDB during a Data ALU operation. This allows new data to be prefetched for use in 
following instructions and results calculated in previous instructions to be stored. With
the exception of ANDI and ORI, the destination of all logical instructions is A1 or B1.
These instructions execute in one instruction cycle. The following are the logical
instructions.


AND           Logical AND
ANDI          AND Immediate Program Controller Register*
EOR           Logical Exclusive OR
LSL           Logical Shift Left
LSR           Logical Shift Right
NOT           Logical Complement
OR            Logical Inclusive OR
ORI           OR Immediate Program Controller Register*
ROL           Rotate Left
ROR           Rotate Right


*These instructions do not allow parallel data moves.


6.2.3 Bit Field Manipulation Instructions 

This group tests the state of any set of bits within a byte in a memory location or a regis- 
ter and then sets, clears, or inverts bits in this byte. Bit fields which can be tested include 
the upper byte and the lower byte in a 16 bit value. The carry bit of the condition code 
register will contain the result of the bit test for each instruction. These instructions are 
read-modify-write type operations and require two instruction cycles. The following are 
the bit field manipulation instructions. 


BFTSTL Bit Field Test Low 
BFTSTH Bit Field Test High 
BFCLR Bit Field Test and Clear 
BFSET Bit Field Test and Set 
BFCHG Bit Field Test and Change 


6.2.4 Loop Instructions 

The loop instructions control hardware looping by initiating a program loop and setting up 
looping parameters, or by “cleaning” up the system stack when terminating a loop. Initial- 
ization includes saving registers used by a program loop (LA and LC) on the system 
stack so that program loops can be nested. The address of the first instruction in a pro- 
gram loop is also saved to allow no-overhead looping. The end address of the DO loop is 
specified as PC relative. The following are the loop instructions. 


DO            Start Hardware Loop
DO FOREVER    Hardware Loop Forever
ENDDO         Disable Current Loop and Unstack Parameters
BRKcc         Conditional Exit from Hardware Loop


6.2.5 Move Instructions 

The move instructions perform data movement over the XDB and over the GDB. Move 
instructions do not affect the condition code register except the limit bit L if limiting is per- 
formed when reading a Data ALU accumulator register. AGU instructions are also 
included among the following move instructions. These instructions do not allow optional 
data transfers. In addition to the following move instructions, there are parallel moves 
which can be used simultaneously with many of the other instructions. 


LEA           Load Effective Address
MOVE          Move Data with or without register transfer - TFR(3)
MOVE(C)       Move Control Register
MOVE(I)       Move Immediate Short
MOVE(M)       Move Program Memory
MOVE(P)       Move Peripheral Data
MOVE(S)       Move Absolute Short


6.2.6 Program Control Instructions 

The program control instructions include branches, jumps, conditional branches and 
jumps and other instructions which affect the PC and system stack. Program control 
instructions may affect the condition code register bits as specified in the instruction. 
The following are the program control instructions. 


Bcc           Branch Conditionally
BSR           Branch to Subroutine (PC relative)
BRA           Branch
BScc          Branch to Subroutine Conditionally
DEBUG         Enter Debug Mode
DEBUGcc       Enter Debug Mode Conditionally
Jcc           Jump Conditionally
JMP           Jump
JSR           Jump to Subroutine
JScc          Jump to Subroutine Conditionally
NOP           No Operation
REP           Repeat Next Instruction
REPcc         Repeat Next Instruction Conditionally
RESET         Reset Peripheral Devices
RTI           Return from Interrupt
RTS           Return from Subroutine
STOP          Stop Processing (low power stand-by)
SWI           Software Interrupt
WAIT          Wait for Interrupt (low power stand-by)


6.3 INSTRUCTION FORMATS 

Instructions are one or two words in length. The instruction and its length are specified by
the first word of the instruction. The next word may contain information about the
instruction itself or about an operand for the instruction. The assembly language source code
for a typical one-word instruction is shown below. The source code is organized into four
columns.


Opcode    Operands    X Bus Data     G Bus Data
MAC       X0,Y0,A     X:(R0)+,X0     X:(R3)+,Y0


The Opcode column indicates the Data ALU, AGU, or PCU operation to be performed. 
The Operands column specifies the operands to be used by the opcode. The X Bus Data 
and G Bus Data columns specify optional data transfers over the X Bus and the address- 
ing modes to be used. The Opcode column must always be included in the source code. 


The DSP offers parallel processing using the Data ALU, AGU and PCU. For the instruc- 
tion word above, the DSP will perform the designated ALU operation (Data ALU), up to 
two data transfers specified with address register updates (AGU), and will also decode 
the next instruction and fetch an instruction from program memory (PCU) all in one 
instruction cycle. When an instruction is more than one word in length, an additional 
instruction execution cycle is required. Most instructions involving the Data ALU are reg- 
ister-based (all operands are in Data ALU registers) and allow the programmer to keep 
each parallel processing unit busy. An instruction which is memory-oriented (such as a 
bit field manipulation instruction) or that causes a control flow change (such as a branch/ 
jump) prevents the use of parallel processing resources during its execution. 


6.4 INSTRUCTION EXECUTION 

Instruction execution is pipelined to allow most instructions to execute at a rate of one
instruction every instruction cycle. However, certain instructions require additional time to
execute. These include instructions which are longer than one word, instructions which
use an addressing mode that requires more than one cycle, instructions which make use
of the global data bus more than once, and instructions which cause a control flow
change. In the last case, a cycle is needed to clear the pipeline.


6.4.1 Instruction Processing 

Pipelining allows the fetch-decode-execute operations of an instruction to occur during 
the fetch-decode-execute operations of other instructions. While an instruction is exe- 
cuted, the next instruction to be executed is decoded, and the instruction to follow the 
instruction being decoded is fetched from program memory. If an instruction is two words 
in length, the additional word will be fetched before the next instruction is fetched. The 
illustration below demonstrates pipelining; F1, D1 and E1 refer to the fetch, decode and 
execute operations, respectively, of the first instruction. Note that the third instruction
contains an instruction extension word and takes two cycles to execute.


Fetch      F1   F2   F3   F3e  F4   F5   F6
Decode          D1   D2   D3   D3e  D4   D5
Execute              E1   E2   E3   E3e  E4

Instruction
Cycle:     1    2    3    4    5    6    7

Figure 6-1 Instruction Pipelining 


Each instruction requires a minimum of 12 clock phases to be fetched, decoded, and 
executed. A new instruction may be started after four phases. Two word instructions 
require a minimum of 16 phases to execute and a new instruction may start after eight 
phases. 
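

The stage overlap described in this section can be sketched with a short Python model. This is an illustration only and not part of the DSP56100 tool set: the function name pipeline_rows and its list-of-word-counts input are assumptions made for the example, and the model covers only straight-line code with no control flow changes, multi-cycle addressing modes, or external memory delays.

```python
def pipeline_rows(word_counts):
    """Build fetch/decode/execute rows for a straight-line instruction sequence.

    word_counts[i] is 1 or 2, the length in words of instruction i+1.
    Each returned row has one entry per instruction cycle; "" is an idle slot.
    """
    slots = []
    for number, words in enumerate(word_counts, start=1):
        slots.append(str(number))
        if words == 2:
            slots.append(f"{number}e")  # the extension word takes its own slot
    # Decode trails fetch by one cycle, execute trails decode by one cycle.
    fetch = ["F" + s for s in slots] + ["", ""]
    decode = [""] + ["D" + s for s in slots] + [""]
    execute = ["", ""] + ["E" + s for s in slots]
    return fetch, decode, execute


if __name__ == "__main__":
    # Hypothetical sequence: the third instruction is two words long.
    for row in pipeline_rows([1, 1, 2, 1]):
        print(" ".join(f"{cell:>4}" for cell in row))
```

Each printed row corresponds to one pipeline stage, and each column to one instruction cycle, so the two-word instruction can be seen occupying two consecutive slots in every stage.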


6.4.2 Memory Access Processing 

One or more of the DSP memory sources (X data memory and program memory) may
be accessed during the execution of an instruction. Each of these memory sources may
be internal or external to the DSP. Three address buses (XA1, XA2, and PAB) and three
data buses (XD, program data, and Global Data) are available for internal memory
accesses during one instruction cycle, but only one address bus and one data bus are
available for external memory accesses (when an external bus is available). If all memory
sources are internal to the DSP, one or both of the memory sources may be
accessed in one instruction cycle (i.e., a program memory access, or a program memory
access plus an X memory reference). However, when one or more of the memories are
external to the DSP, memory references may require additional instruction cycles. With
internal program memory and internal data memory, memory references will not
require any additional instruction cycles (i.e., X memory references will take one
instruction cycle). When program memory is external and the data memory is internal, no
additional instruction cycles are required for any type of operand reference. If the data
memory is also external, an additional cycle is necessary when the external data memory
is accessed (i.e., when an X memory reference is specified). In other words, if each memory
source is external to the DSP, one additional cycle is required whenever the data memory is
accessed.


7.1 INTRODUCTION
The DSP56100 family is always in one of five processing states: normal, exception, 
reset, wait, and stop. These states are described in the following paragraphs. 


7.2 NORMAL PROCESSING STATE

The normal processing state is associated with instruction execution. Details on normal 
processing of the individual instructions can be found in Appendix A. Instructions are 
executed using a three stage pipeline which is described in the following paragraphs. 


7.2.1 Instruction Pipeline 

The 16-bit DSP instruction execution is performed in a three level pipeline allowing most 
instructions to execute at a rate of one instruction every instruction cycle. However, cer- 
tain instructions will require additional time to execute. These include instructions which 
are longer than one word, instructions which use an addressing mode that requires more 
than one cycle, and instructions which cause a control flow change. In the latter case a 
cycle is needed to clear the pipeline. 


Instruction pipelining allows overlapping the execution of instructions such that the
fetch-decode-execute operations of a given instruction occur concurrently with the
fetch-decode-execute operations of other instructions. Specifically, while an instruction is
executed, the next instruction to be executed is decoded, and the instruction to follow the
instruction being decoded is fetched from program memory. Only one word is fetched
per cycle so that if an instruction is two words in length, the additional word will be 
fetched before the next instruction is fetched. Figure 7-1 demonstrates pipelining. F1, 
D1, and E1 refer to the fetch, decode, and execute operations, respectively, of the first 
instruction. The third instruction contains an instruction extension word and takes two 
instruction cycles to execute. Although it takes three instruction cycles for the pipeline to 
fill and the first instruction to execute, an instruction usually executes on each instruction 
cycle thereafter. 


In summary, each instruction requires a minimum of three instruction cycles (12 clock
phases) to be fetched, decoded, and executed. This means that there is a delay of three
instruction cycles on power up to fill the pipe. A new instruction may be started
immediately following the previous instruction. Two-word instructions require a minimum of
four instruction cycles to execute (three cycles for the first instruction word to move
through the pipe and execute and one more for the second word to execute), and a new
instruction may start after the second cycle of the preceding instruction.


Instruction
Cycle      1    2    3    4    5    6    7

Fetch      F1   F2   F3   F3e  F4   F5   F6
Decode          D1   D2   D3   D3e  D4   D5
Execute              E1   E2   E3   E3e  E4


Figure 7-1 Instruction Pipelining 
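

The minimum cycle and phase counts given above follow from simple arithmetic, worked here in Python. This is a sketch under stated assumptions: the function names are invented for the example, four clock phases per instruction cycle is inferred from the statement that three instruction cycles equal 12 clock phases, and the sequence is assumed to be straight-line code with no additional cycles for addressing modes, external memory, or control flow changes.

```python
PHASES_PER_INSTRUCTION_CYCLE = 4  # inferred: 3 instruction cycles = 12 clock phases
PIPELINE_DEPTH = 3                # fetch, decode, execute


def min_cycles(word_counts):
    """Minimum instruction cycles to fetch, decode, and execute a sequence.

    One fetch slot is needed per instruction word; the last word then needs
    two more cycles to pass through decode and execute.
    """
    return sum(word_counts) + PIPELINE_DEPTH - 1


def min_phases(word_counts):
    """Same quantity expressed in clock phases."""
    return min_cycles(word_counts) * PHASES_PER_INSTRUCTION_CYCLE


if __name__ == "__main__":
    print(min_cycles([1]), min_phases([1]))  # a single one-word instruction
    print(min_cycles([2]), min_phases([2]))  # a single two-word instruction
```

Once the pipe is full, each additional one-word instruction adds one instruction cycle to the total, which is the sense in which most instructions execute at one instruction per instruction cycle.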


The pipeline is normally transparent to the user. However, it will affect program execution 
in some situations. These situations are instruction sequence dependent and are best 
described by case studies. Most of these restricted sequences occur because (1) all 
addresses are formed during instruction decode or (2) contention for an internal resource 
such as the status register (SR) occurs. If the execution of an instruction depends on the 
relative location of the instruction in a sequence of instructions, there is a pipeline effect. 
To test for a suspected pipeline effect, compare the execution of the suspect
instruction (1) when it directly follows the previous instruction and (2) when four NOPs
are inserted between the two. If there is a difference, it is due to a pipeline effect. The
16-bit DSP assembler is designed to flag instruction sequences with potential pipeline
effects so that the user can decide if the operation will be as expected.


Case 1: The following two examples show similar code sequences, the first with no pipe- 
line effect and the second with a pipeline effect. 


1) No pipeline effect: 


ORI #xx,CCR ;Changes CCR at the end of execution time slot 
Jcc XXXX ;Reads condition codes in SR in its execution time slot 


The Jcc will test the bits modified by the ORI without any pipeline effect in the code seg- 
ment above. 


2) Instruction which started execution during decode: 


ORI #03,OMR ;Sets MA, MB bits at execution time slot 
MOVE x:$100,a ;Reads internal RAM instead of external RAM 


There is a pipeline effect in example 2 because the address of the move is formed at its 
decode time before the ORI changes the MA and MB bits (which change the memory 
map) in the ORI’s execution time slot. The following code produces the expected results 
of reading the external RAM: 


ORI #03,OMR    ;Sets MA, MB bits at execution time slot
NOP            ;Delays the MOVE so it will read the updated OMR
MOVE x:$100,a  ;Reads external RAM
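

The timing behind Case 1 can be illustrated with a small Python model. It is a simplified sketch, not a description of the actual hardware: the dictionaries, function names, and the single state dictionary are assumptions made for the example. The only behavior taken from the text is that an instruction forms its address in its decode slot, while the preceding instruction commits its register write at the end of its execute slot in the same cycle.

```python
def run(program, state):
    """Step instructions through overlapped decode and execute slots.

    Each instruction is a dict with optional "decode" and "execute" callables
    that receive the shared state. In any one cycle the younger instruction
    decodes before the older instruction commits its execute-slot write.
    """
    previous = None
    for current in list(program) + [None]:  # one extra cycle drains the pipe
        if current is not None and "decode" in current:
            current["decode"](state)        # address is formed here
        if previous is not None and "execute" in previous:
            previous["execute"](state)      # register is written here
        previous = current
    return state


def set_mode_bits(state):
    state["OMR"] |= 0x03                    # models ORI #03,OMR


def form_move_address(state):
    # Record which memory map the MOVE sees when its address is formed.
    state["map_seen_by_move"] = state["OMR"] & 0x03


ORI_OMR = {"execute": set_mode_bits}
NOP = {}
MOVE_X = {"decode": form_move_address}


def map_seen(with_nop):
    program = [ORI_OMR] + ([NOP] if with_nop else []) + [MOVE_X]
    return run(program, {"OMR": 0})["map_seen_by_move"]


if __name__ == "__main__":
    print("without NOP:", map_seen(False))
    print("with NOP:   ", map_seen(True))
```

Without the NOP, the modeled MOVE still sees the old mode bits; with the NOP, it sees the updated ones. The same model applies to any register that is written in an execute slot and read in the following decode slot.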


Case 2: One of the more common sequences where pipeline effects are apparent is:


MOVE #xxxx,Rn  ;Move a number into register Rn (n=0-3).
MOVE X:(Rn),A  ;Use the new contents of Rn to address memory.


In this case, before the first MOVE instruction has written Rn during its execution cycle, 
the second MOVE has accessed the old Rn and therefore will use the old contents of Rn. 
This is because the address for indirect moves is formed during the decode cycle. This 
overlapping instruction execution in the pipeline causes the pipeline effect. One instruc- 
tion cycle should be allowed after a register has been written by a MOVE instruction 
before the new contents are available for use by another MOVE instruction. The proper 
instruction sequence is: 


MOVE X0,Rn ;Move a number into register Rn. 
NOP ;Execute any instruction or instruction sequence not using Rn 
MOVE X:(Rn),A ;Use the new contents of Rn. 


Case 3: A situation related to Case 2 can be seen in the boot ROM program. At the end 
of the bootstrap operation, the OMR is changed to Mode #2 and then the program that 
was loaded is executed. This process is accomplished in the last three instructions which 
are shown below: 


_BOOTEND

MOVEC #2,OMR    ; Set the operating mode to 2
                ; (and trigger an exit from
                ; bootstrap mode).

ANDI #$0,CCR    ; Clear SR as if RESET and
                ; introduce delay needed for
                ; Op. Mode change.

JMP <$0         ; Start fetching from PRAM, P:$0000


The JMP instruction generates its jump address during its decode cycle. If the JMP 
instruction followed the MOVEC, the MOVEC instruction would not have changed the 
OMR before the JMP instruction formed the fetch address. As a result, the jump would 
fetch the instruction at P:$0000 of the bootstrap ROM (MOVE #$FFC0,R2). The OMR
would then change due to the MOVEC instruction and the next instruction would be the 
second instruction of the downloaded code at P:$0001 of the internal RAM. However, the 
ANDI instruction allows the OMR to be changed before the JMP instruction uses it and 
the JMP fetches P:$0000 of the internal RAM as intended. 


Case 4: An interrupt has two additional control cycles which are executed in the interrupt
controller concurrently with the fetch, decode, and execute cycles (see Section 7.3
“Exception Processing” and Figure 7-2). During these two control cycles, the interrupt is
arbitrated by comparing the interrupt mask level with the interrupt priority level (IPL) of 
the interrupt and either allowing or disallowing the interrupt. Therefore, if the interrupt 
mask is changed after an interrupt is arbitrated and accepted as pending but before the 
interrupt is executed, the interrupt will be executed regardless of what the mask was 
changed to. The following examples show that the old interrupt mask is in effect for 
up to four additional instruction cycles after the interrupt mask is changed. Note 
that all instructions shown in the examples here are one word instructions; however, one 
two-word instruction can replace two one-word instructions except where noted. 


Program flow with no interrupts after interrupts are disabled: 


ORI #03,MR ;disable interrupts
INST 1 
INST 2 
INST 3 
INST 4 


Possible variations in program flow which may occur after interrupts are disabled: 


ORI #03,MR     ORI #03,MR     ORI #03,MR     ORI #03,MR
II             INST 1         INST 1         INST 1
II+1           II             INST 2         INST 2
INST 1         II+1           II             INST 3      (see note 1)
INST 2         INST 2         II+1           II
INST 3         INST 3         INST 3         II+1


Note 1: INST 3 may be executed at that point only if the preceding instruction (INST 2)
was a single-word instruction.


Note 2: II = Interrupt Instruction from maskable interrupt.


The following program flow WILL NOT occur because the ORI instruction becomes
effective after a pipeline latency of four instruction cycles:


ORI #03,MR ;Disable interrupts.
INST 1
INST 2
INST 3
INST 4
II         ;Interrupts disabled.
II+1       ;Interrupts disabled.


Program flow without interrupts after interrupts are re-enabled: 


ANDI #00,MR ;enable interrupts 
INST 1 
INST 2 
INST 3 
INST 4 


Program flow with interrupts after interrupts are re-enabled: 


ANDI #00,MR ;Enable interrupts
INST 1      ;Uninterruptable
INST 2      ;Uninterruptable
INST 3      ;II fetched
INST 4      ;II+1 fetched
II
II+1


The DO instruction is another instruction which begins execution during the decode cycle 
of the pipeline. As a result, there are a number of restrictions concerning access conten- 
tion with the program controller registers which are accessed by the DO instruction. The 
ENDDO instruction has similar restrictions. Appendix A contains additional information 
on the DO and ENDDO instruction restrictions. 


Case 5: A resource contention problem can occur when one instruction is using a regis- 
ter during its decode while the instruction executing is accessing the same resource. 
One example of this is: 


MOVEC X:$100,SSH
DO #$10,END


The problem occurs because the MOVEC instruction loads the contents of X:$100 into
the SSH during T3 of its execute cycle. The DO instruction that follows pushes the stack
(LA → SSH, LC → SSL) during T3 of its decode cycle. Therefore the two instructions try
writing to the SSH simultaneously and conflict.


7.2.2 Summary of Pipeline Related Restrictions

A summary of the instruction sequences that cause pipeline effects is given in the follow- 
ing paragraphs. Additional information concerning the individual instructions can be 
found in Appendix A. 


7.2.2.1 DO Instruction Restrictions
The DO instruction must not be immediately preceded by any of the following instruc- 
tions: 


* BFCHG/BFCLR/BFSET LA, LC, SSH, SSL or SP 
* MOVEC/MOVEM to LA, LC, SSH, SSL or SP 
* MOVEC/MOVEM from SSH 


7.2.2.2 Restrictions Near the End of DO Loops 

Proper DO loop operation is guaranteed if no instruction starting at address LA-2, LA-1 
or LA specifies the program controller registers SR, SP, SSL, LA, LC or (implicitly) PC as 
a destination register; or specifies SSH as a source or destination register. Also, SSH 
can not be specified as a source register in the DO instruction itself. 


These restricted instructions include: 


- at LA-2, LA-1 and LA:


* DO
* BFCHG/BFCLR/BFSET LA, LC, SR, SP, SSH, or SSL
* BFTST SSH
* MOVEC/MOVEM/MOVEP from SSH
* MOVEC/MOVEM/MOVEP to LA, LC, SR, SP, SSH, or SSL
* ANDI/ORI MR


* any two word instruction
* Jcc, Bcc, JMP, BRA, JScc, BScc, JSR, BSR
* REP, RESET, RTI, RTS, STOP, WAIT


Other restrictions:


* DO SSH,xxxx
* JSR/JScc/BSR/BScc to (LA), if Loop Flag is set


7.2.2.3 ENDDO Instruction Restrictions
The ENDDO instruction must not be immediately preceded by any of the following 
instructions: 


* BFCHG/BFCLR/BFSET LA, LC, SR, SSH, SSL or SP 
* MOVEC/MOVEM to LA, LC, SR, SSH, SSL or SP 

* MOVEC/MOVEM from SSH 

* ANDI/ORI MR 


7.2.2.4 RTI and RTS Instruction Restrictions
The RTI instruction must not be immediately preceded by any of the following instruc- 
tions: 


* BFCHG/BFCLR/BFSET SR, SSH, SSL or SP
* MOVEC/MOVEM to SR, SSH, SSL or SP
* MOVEC/MOVEM from SSH
* ANDI MR, ANDI CCR
* ORI MR, ORI CCR


The RTS instruction must not be immediately preceded by any of the following
instructions:


* BFCHG/BFCLR/BFSET SSH, SSL or SP
* MOVEC/MOVEM to SSH, SSL or SP
* MOVEC/MOVEM from SSH


7.2.2.5 SP and SSH/SSL Register Manipulation Restrictions 
In addition to all the above restrictions concerning SP, SSH, and SSL, the following 
instruction sequences are illegal: 


* BFCHG/BFCLR/BFSET SP
* MOVEC/MOVEM/MOVEP from SSH or SSL

and

* MOVEC/MOVEM to SP
* MOVEC/MOVEM/MOVEP from SSH or SSL


Also, the instruction MOVEC SSH,SSH is illegal.


7.2.2.6 Rn, Nn, and Mn Register Restrictions 
If an address register (R0-R3, N0-N3, or M0-M3) is changed with a move type instruction
(LUA, Tcc, MOVE, MOVEM, MOVEC or parallel move), the new contents will not be
available for use as a pointer until the second following instruction. This restriction does
not apply to registers updated as part of an indirect addressing mode.
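

A check for this restriction can be sketched in Python. The function below is a hypothetical helper, not a feature of the DSP assembler: it assumes each instruction has already been reduced to two sets, the address registers it loads with a move type instruction and the address registers it uses to form an address. Registers updated as part of an indirect addressing mode are exempt from the restriction and should not be listed in the first set.

```python
def stale_pointer_uses(instructions):
    """Return (index, register) pairs that violate the address register rule.

    instructions is a list of (moved_into, used_for_addressing) pairs, one per
    instruction, each a set of register names such as {"R0"} or {"N2"}.
    A hit means a register is used for addressing by the instruction that
    immediately follows the move that changed it.
    """
    hits = []
    previously_moved = set()
    for index, (moved_into, used_for_addressing) in enumerate(instructions):
        for register in sorted(previously_moved & set(used_for_addressing)):
            hits.append((index, register))
        previously_moved = set(moved_into)
    return hits


if __name__ == "__main__":
    # MOVE X0,R0 followed directly by MOVE X:(R0),A
    print(stale_pointer_uses([({"R0"}, set()), (set(), {"R0"})]))
    # Same sequence with one unrelated instruction in between
    print(stale_pointer_uses([({"R0"}, set()), (set(), set()), (set(), {"R0"})]))
```

The first sequence is reported and the second is not, which matches the rule that the new contents are available to the second following instruction.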


7.2.2.7 Fast Interrupt Routine Restrictions 
BRKcc, DO, SWI, STOP, and WAIT may not be used in a fast interrupt routine. 


7.3 EXCEPTION PROCESSING (INTERRUPT PROCESSING)

Exception processing in a digital signal processing environment is primarily associated 
with transfer of data between DSP memory or registers and a peripheral device. When 
an interrupt occurs, a limited context switch must be performed with minimum overhead. 


When a hardware interrupt is received, it is synchronized on instruction boundaries so 
that the first two interrupt instruction words can be inserted into the instruction stream. 
Suppose that the interrupt is stored in the interrupt pending latch during the current 
instruction fetch cycle. During the next cycle, which is the decode cycle of the current 
instruction, the PC will be updated to fetch the next instruction. However, in the following 
cycle, which is the execution cycle of the current instruction, the address placed on the 
program address bus (PAB) comes from the appropriate interrupt start address, rather 
than from the PC. Note that the PC is frozen until exception processing terminates. 


Figure 7-2 illustrates the effect of the interrupt controller, which is simply to insert two 
instruction words into the processor’s instruction stream. 


The following one-word instructions are aborted when they are fetched in the cycle
preceding the fetch of the first interrupt instruction word: REP, REPcc, BRKcc, STOP,
WAIT, RESET, RTI, RTS, Jcc, Bcc, JMP, BRA, BScc, JScc, JSR, and BSR.


A two-word instruction is aborted when the fetch of the first interrupt instruction word
replaces the fetch of its second word. Aborted instructions are refetched when program
control returns from the interrupt routine. The PC is
adjusted appropriately prior to the end of the decode cycle of the aborted instruction.


If the first interrupt word fetch occurs in the cycle following the fetch of a one-word
instruction not listed above or the second word of a two-word instruction, that instruction
will complete normally prior to the start of the interrupt routine.


Int. Ctr cyc1    i*
Int. Ctr cyc2
Fetch            n3   n4   ii1  ii2  n5   n6   n7   n8   ii3  ii4
Decode           n2   n3   n4   ii1  ii2  n5   n6   n7   n8   ii3  ii4
Execute          n1   n2   n3   n4   ii1  ii2  n5   n6   n7   n8   ii3
Instruction
decode order     1    2    3    4    5    6    7    8    9    10   11

i = interrupt request 

ii = interrupt instruction word 

n = normal instruction word 

* subsequent interrupts are enabled at this time 


Figure 7-2 Interrupt Pipeline Action 


The following cases have been identified where service of an interrupt might encounter 
an extra delay: 


1. If a long interrupt routine is used to service an SWI, then the processor priority
level is set to 3. Thus, all interrupts except for other level three interrupts are
disabled until the SWI service routine terminates with an RTI (unless the SWI
service routine software lowers the processor priority level).


2. While servicing an interrupt, the next interrupt service will be delayed according 
to the following rule: 


After the first interrupt instruction word reaches the instruction decoder, at least 
three more instructions will be decoded before decoding the next first interrupt 
instruction word. If any one pair of instructions being counted is the REP in- 
struction followed by an instruction to be repeated then the combination is 
counted as two instructions independently of the number of repeats done. 


Sequential REP combinations will cause pending interrupts to be rejected and 
can not be interrupted until the sequence of REP combinations ends. 


3. The following instructions are not interruptable: BRKcc, SWI, STOP, WAIT, and 
RESET. 


4. The REP and REPcc instructions and the instruction being repeated are not
interruptable.


5. Instructions using a Read-Modify-Write bus access cannot be interrupted
during their bus access.


During an interrupt instruction fetch, two instruction words are fetched, the first from the 
interrupt starting address and the second from the interrupt starting address +1 loca- 
tions. 


7.3.1 Interrupt Types 

Two types of interrupt routines may be used: fast and long. The fast routine consists of 
the two automatically inserted interrupt instruction words. These words can contain any 
un-restricted single two-word instruction or any two one-word instructions (see Appendix 
A - section A.8 “Instruction Sequence Restrictions” for a list of restrictions). Fast interrupt 
routines are never interruptable. 


CAUTION 


Status is not preserved during a fast interrupt routine; therefore, instructions 
which modify status should not be used at the interrupt starting address and 
interrupt starting address +1. 


If one of the instructions in the fast routine is a jump or branch to subroutine, then a long 
interrupt routine is formed. The long interrupt routine should be terminated by an RTI. 
Long interrupt routines are interruptable by higher priority interrupts. 


7.3.2 Interrupt Arbitration 

External interrupts are internally synchronized with the processor clock (this takes up to 
three T cycles) before their interrupt pending flags are set. Each separate external inter- 
rupt and internal interrupt has its own independent flag. After each instruction is exe- 
cuted in normal processing mode, all interrupts are arbitrated. This includes all hardware 
interrupts that have been latched into their respective interrupt pending flags and all 
internal interrupts. During arbitration, each interrupt’s IPL is compared with the interrupt 
mask in the SR and the interrupt is either allowed or disallowed. The remaining interrupts 
are prioritized according to the priority shown in Table 7-5 and the highest priority inter- 
rupt is chosen. The interrupt vector is then calculated so that the Program Interrupt Con- 
troller can fetch the first interrupt instruction. Interrupt arbitration and control occurs 
concurrently with the fetch-decode-execute cycle and takes two instruction cycles. Inter- 
rupts from a given source are not buffered. The interrupt pending flag for the chosen 
interrupt is not cleared until the second interrupt vector of the chosen interrupt is being 
fetched. A new interrupt from the same source will not be accepted for the next interrupt 
arbitration until that time.


The internal “interrupt acknowledge” signal is used to clear the edge-triggered interrupts’
flags, the Stack Error, Illegal Interrupt and SWI. Peripheral interrupt requests that need a 
read/write action to some register DO NOT receive this signal, and those interrupts will 
remain pending until their registers are read/written. Also, level-triggered interrupts will 
not be cleared. Note that the acknowledge signal will be generated after generation of 
the interrupt vectors, and not before. 


However, the first instruction word of the next interrupt service will reach the decoder 
only after the decoding of at least four instructions following the decoding of the first 
instruction of the previous interrupt. 
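

The arbitration step described in this section can be summarized with a short Python sketch. Several details are assumptions made for illustration and are not taken from this manual: the Pending type and its field names, the source names in the usage example, the rule that an interrupt is allowed when its IPL is greater than or equal to the interrupt mask, and the use of a single numeric rank (lower wins) to stand in for the priority order of Table 7-5.

```python
from typing import NamedTuple, Optional, Sequence


class Pending(NamedTuple):
    name: str  # hypothetical label for the interrupt source
    ipl: int   # interrupt priority level of the source, 0 to 3
    rank: int  # stand-in for the Table 7-5 order; lower rank wins (assumed)


def arbitrate(pending: Sequence[Pending], mask: int) -> Optional[Pending]:
    """Pick the interrupt to service, or None if every source is masked."""
    # Step 1: compare each IPL with the interrupt mask in the SR.
    # Assumption: a source is allowed when its IPL is at least the mask value.
    allowed = [p for p in pending if p.ipl >= mask]
    if not allowed:
        return None
    # Step 2: of the remaining interrupts, choose the highest priority one.
    return min(allowed, key=lambda p: p.rank)


if __name__ == "__main__":
    sources = [
        Pending("source_a", ipl=1, rank=4),
        Pending("source_b", ipl=2, rank=7),
        Pending("source_c", ipl=2, rank=3),
    ]
    print(arbitrate(sources, mask=2))
    print(arbitrate(sources, mask=3))
```

The sketch does not model the two-cycle duration of arbitration or the clearing of pending flags; it only shows the allow-then-prioritize order of the decision.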


7.3.3 Interrupt Instruction Fetch 

The interrupt controller generates an interrupt instruction fetch address which points to 
the first instruction word of a two-word fast interrupt routine. This address is used for the 
next instruction fetch, instead of the PC, and the interrupt instruction fetch address + 1 is 
used for the subsequent instruction fetch. While the interrupt instructions are being 
fetched, the PC is inhibited from being updated. After the two interrupt words have been 
fetched, the PC is used for any following instruction fetches. 


After the interrupt instructions have been fetched, they are guaranteed to be executed. 
This is true even if the instruction that is currently being executed is a change of flow 
instruction (i.e., JMP, JSR, etc.) that would normally ignore the instructions in the pipe. 
After the interrupt instruction fetch, the PC will point to the instruction that would have 
been fetched if the interrupt instructions had not been substituted. 
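

The effect on the fetch address stream can be shown with a few lines of Python. This is an illustrative sketch only: the function name, its parameters, and the example addresses are hypothetical, and it assumes one-word instructions with no change of flow in the normal instruction stream.

```python
def fetch_addresses(pc, normal_before, vector, normal_after):
    """List program fetch addresses around one interrupt instruction fetch.

    pc            -- address of the next normal instruction word
    normal_before -- normal fetches that occur before the interrupt fetch
    vector        -- interrupt instruction fetch address (hypothetical value)
    normal_after  -- normal fetches that follow the two interrupt words
    """
    stream = [pc + i for i in range(normal_before)]
    pc += normal_before
    # The PC is inhibited from updating while the two interrupt words
    # are fetched from vector and vector + 1.
    stream += [vector, vector + 1]
    # Fetching then resumes from the unchanged PC.
    stream += [pc + i for i in range(normal_after)]
    return stream


if __name__ == "__main__":
    print([hex(a) for a in fetch_addresses(0x0100, 2, 0x0010, 2)])
```

The two interrupt words appear in the stream as a substitution: the normal addresses on either side of them are consecutive, as if nothing had been inserted.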


7.3.4 Interrupt Instruction Execution 

Interrupt instruction execution is considered to be “fast” if neither of the instructions of the 
interrupt service routine causes a change of flow. A jump or branch to subroutine within a 
fast interrupt routine forms a long interrupt which is terminated with an RTI instruction to 
restore the PC and SR from the stack and return to normal program execution. Reset is 
a special exception which will normally contain only a JMP instruction at the exception 
start address. At the programmer’s option, almost any instruction can be used in the fast 
interrupt routine. The restricted instructions include SWI, STOP, and WAIT. Figure 7-3, 
Figure 7-4, Figure 7-5 show the fast and the long interrupt service routines. Notice that 
the fast interrupt executes only two instructions and then automatically resumes
execution of the main program where it left off, whereas the long interrupt must be told to
return to the main program by executing an RTI instruction.


7.3.4.1 Fast Interrupt 


Figure 7-3 illustrates the effect of a fast interrupt routine in the stream of instruction
fetches.


Figure 7-4 shows the sequence of instruction fetches between two fast interrupts. Note
that there is a total of four fetches between the two interrupt fetches (two after the first 
interrupt and two preceding the second interrupt). The requirement for these four fetches 
establishes the maximum rate at which the DSP will respond to interrupts, namely one 
interrupt every six instructions. 
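

As a worked example of this limit, the maximum interrupt rate can be computed from the instruction rate. The sketch below is illustrative: the function name is invented, the 30 MIPS figure is the peak rate quoted in the DSP core feature list, and every instruction is assumed to be a one-word, single-cycle instruction.

```python
def max_interrupt_rate(instructions_per_second, instructions_per_interrupt=6):
    """Upper bound on serviced fast interrupts per second.

    Two interrupt instruction words plus the four normal fetches required
    between interrupt fetches give one interrupt every six instructions.
    """
    return instructions_per_second // instructions_per_interrupt


if __name__ == "__main__":
    # 30 MIPS at 60 MHz is the peak rate from the feature list.
    print(max_interrupt_rate(30_000_000))  # prints 5000000
```

Multi-cycle instructions, two-word instructions, or external memory accesses in the normal instruction stream would lower the achievable rate below this bound.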


Int. Ctr cyc1    i    i*
Int. Ctr cyc2    i    i
Fetch            n3   n4   ii1  ii2  n5   n6   n7   n8   ii3  ii4
Decode           n2   n3   n4   f1   f2   n5   n6   n7   n8   f3   f4
Execute          n1   n2   n3   n4   f1   f2   n5   n6   n7   n8   f3
Instruction
decode order     1    2    3    4    5    6    7    8    9    10   11


f = fast interrupt instruction word (non-control-flow-change) 
i = interrupt request 

ii = interrupt instruction word 

n = normal instruction word 

* subsequent interrupts are enabled at this time 


Figure 7-3 Fast Interrupt Pipeline Action 


The sequence: 


REP #N 
Instruction 


is counted as two instructions regardless of the value of N.


Execution of a fast interrupt routine always follows these rules:


1. No JSR or BSR is located at either of the two interrupt vector addresses. If JScc
or BScc is used, the interrupt remains a fast interrupt if the condition is false.


2. The processor status is not saved. 


3. The fast interrupt routine may (but should not) modify the status of the normal 
instruction stream. 


4. The fast interrupt routine may contain any single two-word instruction or any 
two one-word instructions except SWI, STOP, and WAIT. 




5. The PC, which contains the address of the next instruction to be executed in 
normal processing, remains unchanged during a fast interrupt routine. 


6. The fast interrupt returns without an RTI. 


7. Normal instruction fetching resumes using the PC following the completion of 
the fast interrupt routine. 


8. A fast interrupt is not interruptable. 


9. The primary application is to move data between memory and I/O devices. 


7.3.4.2 Long Interrupt 
A jump to subroutine instruction within the fast interrupt routine forms a long interrupt 
routine. Execution of a long interrupt routine always follows the following rules: 


1. A JSR, BSR, JScc or BScc (with a true condition) to the starting address of the
interrupt service routine is located at one of the two interrupt vector addresses.


2. During execution of the jump to subroutine instruction, the PC and SR are 
stacked. The interrupt mask bits of the SR are updated to mask interrupts of the 
same or lower priority. The Loop Flag and Scaling Mode bits are reset. 


3. The first instruction word of the next interrupt service (of higher IPL) will reach 
the decoder only after the decoding of at least four instructions following the de- 
coding of the first instruction of the previous interrupt. 


4. The interrupt service routine can be interrupted i.e., nested interrupts are sup- 
ported. 


5. The long interrupt routine can be any length and should be terminated by an 
RTI, which restores the PC and SR from the stack. 


Figure 7-4 illustrates the effect of a long interrupt routine on the instruction pipeline. A 
short JSR (that is, a JSR with 8-bit absolute address) is used to form the long interrupt 
routine. For this example, word 4 of the long interrupt routine is an RTI. A subsequent 
interrupt is shown to illustrate the non-interruptible nature of the early instructions in the 
long interrupt service routine. In this example, the interrupts are reenabled, not because 
sr4 was an RTI, but because it was the fourth instruction decoded after ii1 was decoded 
and found to be a JSR instruction. 


Either one of the two instructions of the fast interrupt can be the JSR instruction that 
forms the long interrupt. Notice that if the first fast interrupt vector instruction is a short 
JSR, the second instruction is never used. 
7.3.4.3. Case of the REP Instruction

A REP instruction is treated as a single two-word instruction regardless of how many
times it repeats the second instruction of the pair. Instruction fetches are suspended and
will be reactivated only after the loop counter is decremented to one.


See Figure 7-5 for an example of interrupt service when the instruction that receives the 
internal interrupt service request is the REP instruction (n3 in Figure 7-5). During the 
repeated executions of the instruction that follows the REP instruction (n4), instruction 
fetches are suspended. The fetches will be reactivated only after the loop counter is dec- 
remented to one. During the execution of n4, no interrupts will be serviced. When LC 
finally reaches one, the fetches are reinitiated and the interrupt can be serviced. In Fig- 
ure 7-5 it can be seen that n5 (loaded into the instruction latch from the backup instruc- 
tion latch) is decoded and executed as well as n6 before the first interrupt vector. 


Sequential REP operations cause pending interrupts to be rejected; no interrupt can be
serviced until the sequence of REP operations ends. REP operations are not interruptible
because the instruction being repeated is not refetched. While that instruction is
repeating, no instructions are fetched or decoded, so an interrupt cannot be inserted.


7.3.5 Interrupt Sources 

Exceptions may originate from any of the 32 vector addresses listed in Table 7-1. The
corresponding interrupt starting addresses for each interrupt source are shown. Interrupt
starting addresses are internally generated addresses which point to the first instruction
of the fast interrupt service routine. The interrupt starting address for each interrupt is an
address constant for minimum overhead. Thirty-two interrupt starting address locations
are provided. These addresses are located in the first 64 locations of program memory.
When an interrupt is serviced, the instruction at the interrupt starting address is fetched
first. If it is known a priori that certain interrupts will not be used, those interrupt vector
locations can be used for program or data storage.
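
The arithmetic behind "32 starting addresses in the first 64 locations" can be written out as a short Python sketch. It assumes, consistent with Table 7-1, that each source owns two consecutive program words and that slots are numbered from 0 in table order; the function names are hypothetical.

```python
NUM_VECTORS = 32        # interrupt starting addresses provided
WORDS_PER_VECTOR = 2    # two instruction words per interrupt vector


def interrupt_starting_address(slot):
    """Return the program-memory address of the first vector word for a slot."""
    if not 0 <= slot < NUM_VECTORS:
        raise ValueError("slot must be in the range 0..31")
    return slot * WORDS_PER_VECTOR


def vector_words(slot):
    """Return the addresses of both words of the interrupt vector."""
    base = interrupt_starting_address(slot)
    return (base, base + 1)
```

For example, slot 5 corresponds to the sixth row of Table 7-1 (IRQA), and the last slot ends at location 63, the final word of the 64-word vector area.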


The 32 interrupts are prioritized into four levels. Level 3 is the highest priority level and is
not maskable. Levels 0-2 are maskable. The interrupts within each level are prioritized
according to a predefined priority that is discussed in the next sub-section. The level 3
interrupts (Reset, Illegal Instruction, Stack Error, and SWI) are discussed individually.


Int. Ctr cyc1 i
Int. Ctr cyc2 i i
Fetch n3 n4 ii1 ii2 sr1 sr2 sr3 n5 ii1
Decode n2 n3 n4 JSRf -- sr1 sr2 -- n5
Execute n1 n2 n3 n4 JSRf NOP sr1 RTI NOP
Instruction decode order 1 2 3 4 5 6

The instruction after the RTI is always fetched but not decoded when RTI has been
recognized.

Int. Ctr cyc1 i*
Int. Ctr cyc2 i
Fetch n5 ii1 ii2 n6 n7 n8 n9
Decode RTI -- n5 ii1 ii2 n6 n7 n8
Execute sr3 RTI NOP n5 ii1 ii2 n6 n7
Instruction decode order 8 9 10 11 12 13 14

i = interrupt request
ii = interrupt instruction word
JSRf = fast JSR (JSR with 8-bit absolute address)
n = normal instruction word
sr = service routine word
* subsequent interrupts are enabled at this time

Figure 7-4 Long Interrupt Pipeline Action


Int. Ctr cyc1 i i
Int. Ctr cyc2 i% i*
Fetch n3 n4 n5 n6 ii1 ii2 n7 n8 n9
Decode n2 REP -- n4 n4 n5 n6 ii1 ii2 n7 n8
Execute n1 n2 REP NOP n4 n4 n5 n6 ii1 ii2 n7
Instruction decode order 1 2 3 4 5 6 7 8 9 10 11

i = interrupt request
ii = interrupt instruction word
n = normal instruction word
n3 = REP #2 instruction
n4 = instruction being repeated twice
n5 = instruction that waits in the backup instruction latch
% interrupt rejected at this time
* subsequent interrupts are enabled at this time

Figure 7-5 Example of Interrupt Service when Interrupt is Presented to REP Instruction


7.3.5.1. Hardware Interrupt Sources

There are two types of hardware interrupts in the DSP: internal and external. The internal
interrupts include all of the on-chip peripheral devices (Host Interface, SSIs and Timer).
Each internal interrupt source is latched and serviced if it is not masked. When it is
serviced, the interrupt is cleared. Each internal hardware source has independent enable
control and priority level control.


The external hardware interrupts include RESET, IRQA, and IRQB. The RESET interrupt 
is level sensitive and is the highest level interrupt (priority 3). The IRQA and IRQB inter- 
rupts can be programmed to be level sensitive or edge sensitive. The level sensitive 
interrupts will not be cleared automatically when they are serviced and therefore must be 
cleared by other means to prevent multiple interrupts. The edge sensitive interrupts are 
latched as pending on the high-to-low transition of the interrupt input and automatically 
cleared when the interrupt is serviced. IRQA and IRQB interrupts can be programmed to 
one of three maskable priority levels: level 0, 1, or 2. Additionally, both of these interrupts 
have independent enable control. 


When the IRQA or IRQB interrupts are disabled in the IPR register, the pending request
will be ignored regardless of whether the interrupt input was defined as level sensitive or
edge sensitive. The edge detection latch of an external interrupt remains in the reset
state as long as (1) the interrupt is disabled or (2) the interrupt is defined as level
sensitive. If a level sensitive interrupt is disabled while it is pending, the pending
interrupt will be cancelled. However, if the first instruction of the interrupt has been
fetched, it will not be cancelled.
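
The difference between the two trigger modes can be illustrated with a small behavioral model in Python. This is a sketch under stated assumptions, not a description of the actual circuit: the class and method names are hypothetical, the pin is taken to be active low, and servicing is reduced to a single `service()` call.

```python
class ExternalInterruptInput:
    """Behavioral sketch of an IRQA/IRQB input (names are illustrative only)."""

    def __init__(self, edge_sensitive, enabled=True):
        self.edge_sensitive = edge_sensitive
        self.enabled = enabled
        self._pin_low = False   # assumed active-low input
        self._latch = False     # edge detection latch

    def disable(self):
        # A disabled interrupt holds its edge detection latch in reset.
        self.enabled = False
        self._latch = False

    def drive(self, pin_low):
        falling_edge = pin_low and not self._pin_low
        self._pin_low = pin_low
        # The latch is only released in edge-sensitive mode while enabled.
        if self.enabled and self.edge_sensitive and falling_edge:
            self._latch = True

    def pending(self):
        if not self.enabled:
            return False
        return self._latch if self.edge_sensitive else self._pin_low

    def service(self):
        # Edge-sensitive requests clear automatically when serviced; a
        # level-sensitive request stays pending until the pin is released.
        self._latch = False
```

The model makes the practical consequence visible: after `service()`, an edge-sensitive input is no longer pending even if the pin is still low, whereas a level-sensitive input keeps requesting service until the external device releases the line.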
Table 7-1 Interrupt Sources

Interrupt
Starting
Address    Interrupt Source

$0000      Hardware RESET
$0002      Illegal Instruction
$0004      Stack Error
$0006      Reserved
$0008      SWI
$000A      IRQA
$000C      IRQB
$000E      Reserved
$0010      SSI0 Receive Data with Exception Status
$0012      SSI0 Receive Data
$0014      SSI0 Transmit Data with Exception Status
$0016      SSI0 Transmit Data
$0018      SSI1 Receive Data with Exception Status
$001A      SSI1 Receive Data
$001C      SSI1 Transmit Data with Exception Status
$001E      SSI1 Transmit Data
$0020      Timer Overflow
$0022      Timer Compare
$0024      Host DMA Receive Data
$0026      Host DMA Transmit Data
$0028      Host Receive Data
$002A      Host Transmit Data
$002C      Host Command (default)
$002E      Available for Host Command
$0030      Available for Host Command
$0032      Available for Host Command
$0034      Available for Host Command
$0036      Available for Host Command
$0038      Available for Host Command
$003A      Available for Host Command
$003C      Available for Host Command
$003E      Available for Host Command


Interrupt service starts by fetching the instruction word in the first vector location and is
considered finished when the fetch of the instruction word in the second vector location
happens. In the case of an edge-triggered interrupt, the internal latch is automatically
cleared when the second vector location is fetched. The fetch of the first vector location
DOES NOT GUARANTEE that the second location will be fetched. Figure 7-6 illustrates
one case where the second vector location is not fetched. In Figure 7-6 the SWI
instruction “discards” the fetch of the first interrupt vector to ensure that the SWI vectors
will be fetched. Instruction n4 is decoded as a SWI while ii1 is being fetched. Execution of
the SWI requires that ii1 be discarded and the two SWI instructions (ii3 and ii4) be fetched
instead.


Int. Ctr cyc1 i i*
Int. Ctr cyc2 i i*
Fetch n3 n4 ii1 ii3 ii4 sw1 sw2 sw3 sw4
Decode n2 SWI -- -- -- JSR -- sw1 sw2 sw3
Execute n1 n3 SWI NOP NOP NOP JSR -- sw1 sw2
Instruction decode order 1 2 3 4 5 6 7

i = interrupt request
i* = interrupt request generated by SWI
ii1 = 1st vector of interrupt i
ii3 = 1st SWI vector (1-word JSR)
ii4 = 2nd SWI vector
n = normal instruction word
n4 = SWI
sw = instructions pertaining to the SWI long interrupt routine

Figure 7-6 Software Interrupt Mechanism


CAUTION 


On all level sensitive interrupts, the interrupt must be externally released be- 
fore interrupts are internally re-enabled or the processor will be interrupted 
repeatedly until the interrupt is released. 


7.3.5.2 Software Interrupt Sources 


There are two software interrupt sources - Illegal Instruction Interrupt (III) and Software 
Interrupt (SWI). 


7.3.5.2.1 Illegal Instruction Interrupt 


III is a non-maskable interrupt (IPL 3) which is serviced immediately following the
execution of the ILLEGAL instruction or the attempted execution of an illegal instruction
(any undefined operation code). Illegal instruction interrupts are fatal errors. Only a long
interrupt routine should be used for the III routine. As shown in Figure 7-7, if a fast
interrupt is chosen, everything is frozen after the decode of n5 (II), so this same
instruction will be decoded again after execution of the two fast interrupt words.
Execution will therefore loop forever between the illegal instruction and its fast interrupt
routine. Even when a long interrupt is used, no RTI or RTS should be used at the end of the
interrupt routine, since return from the illegal instruction interrupt to the main code will
result in decoding the illegal instruction again. During the illegal instruction interrupt
service, the JSR located in the III vector will normally stack the address of the illegal
instruction. The user may examine the stack (using MOVE SSH,dest) to locate the offending
illegal instruction. The ILLEGAL instruction is useful for triggering the illegal interrupt
service to see if the III routine is capable of recovery from illegal instructions.


There are two cases in which the stacked address will not point to the illegal instruction: 


1. If the illegal instruction is one of the two instructions at an interrupt vector loca- 
tion, and is fetched during a regular interrupt service, the processor will stack 
the address of the next sequential instruction in the normal instruction flow (the 
regular return address of the interrupt routine that had the illegal opcode in its 
vector). 


2. If the illegal instruction follows a REP instruction (see Figure 7-8), the DSP will
effectively execute the illegal instruction as a repeated NOP, and the interrupt
vector will then be inserted in the pipeline. The next instruction will be fetched but
not decoded or executed. The processor will stack the address of the next
sequential instruction (i.e., n8 in Figure 7-8), which is two instructions after the
illegal instruction.


In DO loops, if the illegal instruction is in the LA location, and the instruction preceding it 
(i.e. at LA-1) is being interrupted with a normal interrupt, the LC will be decremented as if 
the loop had reached the LA instruction. When the interrupt service ends and the instruc- 
tion flow returns to the loop, the illegal instruction will be refetched (since it is the next 
sequential instruction in the flow). The loop state machine will again decrement LC 
because the LA instruction is being executed. At this point, the illegal instruction will trig- 
ger the illegal instruction interrupt. Notice that the loop state machine decremented LC 
twice in one loop due to the presence of the illegal opcode at the LA location. This is a 
special condition that only happens during this situation. 


7.3.5.2.2 Software Interrupt 


SWI is a non-maskable interrupt (IPL 3) which is serviced immediately following the soft- 
ware interrupt instruction execution. A long interrupt service routine is usually used. The 
difference between a SWI and a JSR instruction is that the SWI sets the interrupt mask 
to prevent interrupts with an IPL below three from being serviced. Masking out lower 
level interrupts makes the SWI very useful for setting breakpoints in monitor programs. 
The JSR instruction does not affect the interrupt mask. 


7.3.5.3 Stack Error Interrupt
The stack error interrupt is non-maskable (IPL 3). An overflow or underflow of the stack
causes a stack error interrupt (see Section 5 for additional information on the stack error
flag). The stack error interrupt is caused by a non-recoverable error condition and is
vectored to P:$0004 (the Stack Error starting address listed in Table 7-1). Since the stack
error is non-recoverable, a long interrupt should be used to service the interrupt and the
service routine should not end in an RTI. Executing an RTI instruction “pops” the stack,
which has already been corrupted.


Int. Ctr cyc1
Int. Ctr cyc2
Fetch n3 n4 n5 n6 -- -- ii1 ii2 n5
Decode n2 n3 n4 II -- -- -- ii1 ii2 II
Execute n1 n2 n3 n4 NOP -- -- -- ii1 ii2 NOP
Instruction decode order 1 2 3 4 5 6 7

i = interrupt request
ii = interrupt instruction word
II = Illegal Instruction
n = normal instruction word

Program memory sketch: ii1 and ii2 at P:$0004; main program n3, n4, n5 = II, n6

Figure 7-7 Infinite Looping on Fast Illegal Instruction Interrupt Processing


7.3.6 Interrupt Priority Structure 

Four levels of interrupt priority are provided. Interrupt priority levels (IPLs) numbered 0,
1, and 2 are maskable, with level 0 as the lowest level. Level 3 (the highest level) is
non-maskable. The only level 3 interrupts are Reset, Illegal Instruction, Stack Error, and SWI.
The interrupt mask bits (I1, I0) in the status register reflect the current processor priority
level and indicate the interrupt priority level needed for an interrupt source to interrupt the
processor (see Table 7-2). Interrupts are inhibited for all priority levels less than the
current processor priority level. However, level 3 interrupts are not maskable and therefore
can always interrupt the processor.

Int. Ctr cyc1 i i
Int. Ctr cyc2 i
Fetch n3 n4 n5 n6 n7 -- -- ii1 ii2 n8
Decode n2 n3 n4 REP II -- -- -- ii1 ii2 n8
Execute n1 n2 n3 n4 REP NOP -- -- -- ii1 ii2
Instruction decode order 1 2 3 4 5 6 7 8

i = interrupt request
ii = interrupt instruction word
II = Illegal Instruction
n = normal instruction word

Figure 7-8 Repeated Illegal Instruction


Table 7-2 Status Register Interrupt Mask Bits

I1   I0   Exceptions Permitted   Exceptions Masked

0    0    IPL 0,1,2,3            None
0    1    IPL 1,2,3              IPL 0
1    0    IPL 2,3                IPL 0,1
1    1    IPL 3                  IPL 0,1,2
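
The masking rule of Table 7-2 reduces to a single comparison, sketched here in Python. The sketch assumes the four table rows correspond to the I1:I0 values 0 through 3 in order, and it treats the mask as a plain integer; the function names are hypothetical.

```python
def is_permitted(ipl, mask):
    """Return True if an exception at priority `ipl` may interrupt the processor.

    `mask` is the I1:I0 field of the status register read as a number 0..3.
    """
    if not (0 <= ipl <= 3 and 0 <= mask <= 3):
        raise ValueError("ipl and mask must both be in the range 0..3")
    # Levels below the current processor priority are inhibited; level 3 is
    # non-maskable and therefore always permitted.
    return ipl == 3 or ipl >= mask


def permitted_levels(mask):
    """List the IPLs permitted for a given mask value, as in Table 7-2."""
    return [ipl for ipl in range(4) if is_permitted(ipl, mask)]
```

Calling `permitted_levels` for each mask value reproduces the "Exceptions Permitted" column row by row.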


7.3.6.1 Interrupt Priority Levels (IPL) 


The interrupt priority level for each on-chip peripheral device and for each external
interrupt source (IRQA, IRQB) can be programmed under software control. Each on-chip or
external peripheral device can be programmed to one of the three maskable priority
levels (IPL 0, 1, or 2). Interrupt priority levels are set by writing to the Interrupt Priority
Register shown in Figure 7-9. This read/write register specifies the interrupt priority level
for each of the interrupting devices (HOST, SSIs, Timer, IRQA, IRQB). In addition, this
register specifies the trigger mode of both external interrupt sources and is used to enable
or disable the individual external interrupts. This register is cleared on RESET. Table 7-3
defines the interrupt priority level bits. Table 7-4 defines the external interrupt trigger
mode bits.

 15    14    13    12    11    10    9     8     7    6    5     4     3     2     1     0
 TL1   TL0   S1L1  S1L0  S0L1  S0L0  HL1   HL0   *    *    IBL2  IBL1  IBL0  IAL2  IAL1  IAL0

Bits 15-14   TM IPL
Bits 13-12   SSI1 IPL
Bits 11-10   SSI0 IPL
Bits 9-8     HOST IPL
Bits 7-6     Reserved
Bit 5        IRQB mode
Bits 4-3     IRQB IPL
Bit 2        IRQA mode
Bits 1-0     IRQA IPL

* Read as zero and written with zero for future compatibility.

Figure 7-9 Interrupt Priority Register IPR (Addr X:$FFDF)


Table 7-3 Interrupt Priority Level Bits

xxL1   xxL0   Enabled   IPL

0      0      No        -
0      1      Yes       0
1      0      Yes       1
1      1      Yes       2


Table 7-4 External Interrupt Trigger Mode Bits

IxL2   Trigger Mode

0      Level
1      Negative Edge
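
To make the register layout concrete, the following Python sketch splits a 16-bit IPR value into its fields. The bit positions are read from Figure 7-9; the encoding of the two-bit priority fields (00 = disabled, 01 = IPL 0, 10 = IPL 1, 11 = IPL 2) and of the mode bit (0 = level, 1 = negative edge) is an assumption based on the row order of Tables 7-3 and 7-4. All Python names are hypothetical.

```python
# Field name -> (least significant bit, width in bits), per Figure 7-9.
IPR_FIELDS = {
    "IRQA_IPL": (0, 2),
    "IRQA_MODE": (2, 1),
    "IRQB_IPL": (3, 2),
    "IRQB_MODE": (5, 1),
    "HOST_IPL": (8, 2),
    "SSI0_IPL": (10, 2),
    "SSI1_IPL": (12, 2),
    "TM_IPL": (14, 2),
}


def decode_ipr(value):
    """Decode a 16-bit IPR value into a dict of field settings.

    Priority fields map to None (interrupt disabled) or an IPL of 0..2.
    Mode fields map to "level" or "negative edge".
    """
    decoded = {}
    for name, (lsb, width) in IPR_FIELDS.items():
        raw = (value >> lsb) & ((1 << width) - 1)
        if name.endswith("_MODE"):
            decoded[name] = "negative edge" if raw else "level"
        else:
            # Assumed encoding: 0 disables the source, otherwise IPL = raw - 1.
            decoded[name] = None if raw == 0 else raw - 1
    return decoded
```

Decoding the reset value $0000 yields every source disabled and both external inputs level sensitive, which matches the statement that the register is cleared on RESET.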


7.3.6.2 Exception Priorities within an IPL 

If more than one exception is pending when an instruction is executed, the interrupt with
the highest priority level is serviced first. When multiple interrupt requests with the same
IPL are pending, a second fixed priority structure within that IPL determines which
interrupt is serviced. The fixed priority of interrupts within an IPL and the interrupt enable
bits for all interrupts are shown in Table 7-5. The interrupt enable bits for the HOST, SSIs,
and TM are located in the control registers associated with their respective on-chip
peripherals.
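
The two-stage arbitration described above (programmed IPL first, then fixed order within the IPL) can be sketched in Python for the maskable sources. The list order is taken from Table 7-5 and is assumed to run from highest to lowest priority; the function name and the dictionary-based interface are hypothetical.

```python
# Maskable sources in the order listed in Table 7-5 (assumed highest first).
FIXED_ORDER = [
    "IRQA",
    "IRQB",
    "Host Command",
    "Host/DMA RX Data",
    "Host/DMA TX Data",
    "SSI0 RX Data with Exception Status",
    "SSI0 RX Data",
    "SSI0 TX Data with Exception Status",
    "SSI0 TX Data",
    "SSI1 RX Data with Exception Status",
    "SSI1 RX Data",
    "SSI1 TX Data with Exception Status",
    "SSI1 TX Data",
    "Timer Overflow",
    "Timer Compare",
]


def next_interrupt(pending, mask):
    """Pick the pending maskable source to service, or None if all are masked.

    `pending` maps a source name to its programmed IPL (0..2);
    `mask` is the current I1:I0 value (0..3).
    """
    candidates = [
        (ipl, -FIXED_ORDER.index(source), source)
        for source, ipl in pending.items()
        if ipl >= mask
    ]
    # Highest IPL wins; ties go to the source listed earliest in FIXED_ORDER.
    return max(candidates)[2] if candidates else None
```

Note that a source low in the fixed order still wins if it has been programmed to a higher IPL than its competitors; the fixed order only breaks ties.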


7.4 RESET STATE PROCESSING 
The reset processing state is entered in response to the external RESET pin being 
asserted (a hardware reset). Upon entering the reset state: 
1. internal peripheral devices are reset, and their pins revert to general-purpose
I/O pins.


2. the modifier registers are set to $FFFF. 
3. the interrupt priority register is cleared. 


4. the BCR is set to $43FF, thereby inserting 31 wait states in all external memory 
accesses. 


5. the stack pointer is cleared. 


6. the loop flag, forever flag, and scaling mode bits in the MR register are cleared, the
interrupt mask bits are set, and all CCR bits are cleared.


7. the OMR bits CD (Clockout Disable), SD (Stop delay), R (Rounding), SA (Sat- 
uration) are cleared. 


The DSP remains in the reset state until RESET is deasserted. Upon leaving the reset 
state: 


1. the chip operating mode bits of the OMR are loaded from the external mode select
pins (MODA, MODB, MODC).


2. program execution begins at program memory address $E000 in normal ex- 
panded mode or at $0000 in all other operation modes. The first instruction 
must be fetched and then decoded before executing. Therefore, the first in- 
struction is executed two instruction cycles after the first instruction fetch. Two 
NOPs are executed in the two instruction cycles before the first instruction is 
executed. 


The internal peripheral devices (HI, SSIO, SSI1, and ports A, B, and C) can be reset by 
several methods — hardware (HW) reset, software (SW) reset, individual (I) reset, and 
stop (ST) reset. Depending on the type of reset, the registers of these devices will be 
affected differently (see SECTIONS 8,9,10,11,12 for additional information on the inter- 
nal peripherals). 


7.5 WAIT STATE PROCESSING 

The wait processing state is a low power consumption state entered by execution of the 
WAIT instruction. In the wait state, the internal clock is disabled to all internal circuitry 
except the internal peripherals. All internal processing is halted until an unmasked inter- 
rupt occurs or the DSP is reset. The bus arbitration circuits (BR, BG, and BB pins) 
remain active during the Wait state if the DSP was in the slave mode (MC=0) before 
entering the WAIT state. The wait state is one of two low power states. 
Figure 7-10 shows a WAIT instruction being fetched, decoded, and executed. It is
fetched as n3 in this example and during decode is recognized as a WAIT instruction. 
The following instruction (n4) is aborted and the internal clock is disabled from all internal 
circuitry except the internal peripherals. The processor stays in this state until an inter- 
rupt or reset is recognized. The response time is variable due to the timing of the inter- 
rupt with respect to the internal clock. Figure 7-10 shows the result of a fast interrupt 
bringing the processor out of the wait state. The two appropriate interrupt vectors are 
fetched and put in the instruction pipe. The next instruction fetched is n4 which had been 
aborted earlier. Instruction execution proceeds normally from this point on. 


Figure 7-11 shows an example of the WAIT instruction being executed at the same time 
that an interrupt is pending. Instruction n4 is aborted as before. There is a five instruction 
cycle delay caused by the WAIT instruction and then the interrupt is processed normally. 
The internal clocks are not turned off and the net effect is that of executing eight NOP 
instructions between the execution of n2 and ii1. 

Table 7-5 Exception Priorities within an IPL

Priority   Exception                              Enabled by        Control Register
                                                                    Address

Level 3 (Non-maskable)

Highest    Hardware RESET                         -                 -
           Illegal Instruction Interrupt
           Stack Error
Lowest     SWI

Level 0, 1, 2 (Maskable)

Highest    IRQA (External Interrupt)              IRQA mode bits    X:$FFDF
           IRQB (External Interrupt)              IRQB mode bits    X:$FFDF
           Host Command Interrupt                 HCIE              X:$FFC4
           Host/DMA RX Data Interrupt             HRIE              X:$FFC4
           Host/DMA TX Data Interrupt             HTIE              X:$FFC4
           SSI0 RX Data with Exception Status     RIE               X:$FFD1
           SSI0 RX Data                           RIE               X:$FFD1
           SSI0 TX Data with Exception Status     TIE               X:$FFD1
           SSI0 TX Data                           TIE               X:$FFD1
           SSI1 RX Data with Exception Status     RIE               X:$FFD9
           SSI1 RX Data                           RIE               X:$FFD9
           SSI1 TX Data with Exception Status     TIE               X:$FFD9
           SSI1 TX Data                                             X:$FFD9
           Timer Overflow Interrupt                                 X:$FFEC
           Timer Compare Interrupt                                  X:$FFEC




Int. Ctr cyc1 i
Int. Ctr cyc2 i*
Fetch n3 n4 -- ii1 ii2 n4 n5 n6
Decode n2 WAIT -- ii1 ii2 n4 n5
Execute n1 n2 WAIT -- ii1 ii2 n4
Instruction decode order 1 2 3 4 5 6

i = interrupt request
ii = interrupt instruction word
n = normal instruction word

Figure 7-10 WAIT Instruction


During the wait state, the BR/BG/BB circuits remain active if the DSP was in the slave
mode. Before BR is asserted (see Table 7-6), all Port A signals are driven: the control
signals are deasserted, the data signals are inputs, and the address signals remain as
the last address read or written. When BG is asserted, all signals are three-stated (high
impedance). Immediately after BR is deasserted, the R/W, PS/DS, and TS signals are
driven high; all other signals remain three-stated. During the first T0 clock state
following the exit from the wait state, control signals PS/DS and TS are again driven; the
data and address signals remain three-stated. During the first external access, all signals
return to their normal operating mode.


Table 7-6 BR/BG During WAIT (Slave Mode)

Signal    Before BR   While BG   After BR        After Return to Normal   After 1st
          Asserted    Asserted   Deasserted      State from Wait State    External Access

PS/DS     Output      Hi-Z       Hi-Z            Output                   Output
TS        Output      Hi-Z       Hi-Z            Output                   Output
R/W       Output      Hi-Z       Output (Read)   Output                   Output
Data      I/O         Hi-Z       Hi-Z            Hi-Z                     I/O
Address   Output      Hi-Z       Hi-Z            Hi-Z                     Output


Interrupt Synchronized and Recognized as Pending
|<-- 5 Instruction Cycle Delay -->|

Int. Ctr cyc1 i
Int. Ctr cyc2 i*
Fetch n3 n4 -- ii1 ii2
Decode n2 WAIT -- ii1 ii2
Execute n1 n2 WAIT -- ii1
Instruction decode order 1 2 3 4 5

i = interrupt request
ii = interrupt instruction word
n = normal instruction word

Figure 7-11 Simultaneous Wait Instruction and Interrupt


7.6 STOP STATE PROCESSING


The stop processing state is the lowest power consumption state and is entered by the
execution of the STOP instruction. In the stop state, all circuits are powered down except
for (1) the ED register, (2) the PLL when it is enabled, and (3) the CLKO circuitry when
clockout is used. If the PLL and CLKO circuitry are not being used when the STOP
instruction is executed, they will be powered down; however, the input buffer used to
square EXTAL will still be active but will not dissipate power if the EXTAL pin is
grounded. The chip clears all peripherals and external interrupts (IRQA, IRQB) when
entering the stop state. Stack errors that were pending remain pending. The priority
levels of the peripherals remain as they were before the STOP instruction was executed.
The on-chip peripherals are held in their respective individual reset states while in the
stop state.


All activity in the processor is halted until one of the following actions occurs: 


1. A low level is applied to the IRQA pin. 
2. A low level is applied to the RESET pin. 


Either of these actions will gate on the oscillator and, after a clock stabilization delay, 
clocks to the processor and peripherals will be re-enabled. The clock stabilization delay 
period is determined by the stop delay (SD) bit in the OMR. 


The STOP sequence is composed of eight instruction cycles called STOP cycles. These 
are differentiated from normal instruction cycles because the fourth cycle is stretched an 
indeterminate period of time while the four phase clock is turned off. 

IRQA
Fetch n3 n4 -- n4
Decode n2 STOP
Execute n1 n2 STOP
STOP cycle count 1 2 3 4 5 6 7 8 (9)
Clock stopped; 524K T or 28 T cycle count started
Resume STOP cycle count 4, interrupts enabled

Figure 7-12 STOP Instruction Sequence


The STOP instruction is fetched in STOP cycle 1 of Figure 7-12, decoded in STOP cycle
2 (which is where it is first recognized as a stop command) and executed in STOP cycle
3. The next instruction (n4) is fetched during STOP cycle 2 but is not decoded in STOP
cycle 3 because, by that time, the STOP instruction prevents the decode. The processor
stops the clock and enters the stop mode. The processor will stay in the stop mode until
it is restarted.


Figure 7-13 shows the case of the IRQA signal being asserted to exit the stop state. If
the exit from stop state was caused by a low level on the IRQA pin, then the processor
will service the highest priority pending interrupt. If no interrupt is pending, then the
processor resumes at the instruction following the STOP instruction that caused the entry
into the stop state.


IRQA
Fetch n3 n4 -- n4
Decode n2 STOP
Execute n1 n2 STOP
STOP cycle count 1 2 3 4 5 6 7 8 (9)
Clock stopped; 524K T or 28 T cycle count started
Resume STOP cycle count 4, interrupts enabled

Figure 7-13 STOP Instruction Sequence Followed by IRQA


An IRQA deasserted before the end of the STOP cycle count will not be recognized as 
pending. If IRQA is asserted when the STOP cycle count completes, then an IRQA inter- 
rupt will be recognized as pending and arbitrated with any other interrupts if the IRQA 
was defined as level sensitive. 


Specifically, when IRQA is asserted, the internal clock generator is started and begins a
delay determined by the SD bit of the OMR. If the internal clock oscillator is used, the SD
bit should be set to 0, which enables a delay count of 524K T cycles (i.e., [2^19 - 4] T
cycles) to allow the clock oscillator to stabilize. If a stable external clock is used, the SD
bit may be set to 1, which enables a 28 T (i.e., [2^5 - 4] T) cycle delay.


The following description assumes that SD=0 (the 524K T counter is used). During the 
524K T count, interrupts are ignored until the last few count cycles. At this time, the inter- 
rupts are synchronized. At the end of the 524K T cycle delay period, the chip restarts 
instruction processing, the 4th stop cycle is completed (interrupt arbitration occurs at this 
time) and stop cycles 5,6,7, and 8 are executed (it takes 17T from the end of the 524K T 
delay to the first instruction fetch). If the IRQA signal is released (pulled high) after 4T 
minimum but less than 524K T cycles, no IRQA interrupt will occur and the instruction 
fetched after STOP cycle 8 will be the next sequential instruction (n4 in Figure 7-14). An 
IRQA interrupt will be serviced (as shown in Figure 7-13) if (1) the IRQA signal had previ- 
ously been initialized as level sensitive, (2) it is held low from the end of the 524K T cycle 
delay counter to the end of stop cycle count 8, and (3) no interrupt with a higher interrupt 
level is pending. If IRQA is not asserted during the last part of the STOP instruction 
sequence (6,7, and 8), and no interrupts are pending, the processor will refetch the next 
sequential instruction (n4). Since in Figure 7-13 the IRQA signal is asserted, the
processor will recognize the interrupt and then fetch and execute the instructions at
P:$000A and P:$000B, which are the IRQA interrupt vector locations (see Table 7-1).


To ensure servicing IRQA immediately after leaving the STOP state, the following steps 
must be taken before the execution of the STOP instruction: 


1. Define IRQA as level sensitive. 


2. Define IRQA priority as higher than the other sources and higher than the pro- 
gram priority. 


3. Ensure that no stack error is pending. 
4. Execute the STOP instruction and enter the STOP state.


5. Recover from the STOP state by asserting the IRQA pin and holding it asserted 
for the whole clock recovery time. If it is low, the IRQA vector will be fetched. 


6. Because the exact elapsed time for clock recovery is unpredictable, the external device
that asserts IRQA must wait for some positive feedback, such as a specific memory
access or a change in some predetermined I/O pin, before deasserting IRQA.


The STOP sequence totals 524K T cycles (i.e., [2^19 - 4] T cycles) if SD=0 or 28 T cycles if
SD=1, in addition to the period with no clocks from the STOP fetch to the IRQA vector
fetch (or next instruction). However, there is an additional delay if the internal oscillator is 
used. An indeterminate period of time is needed for the oscillator to begin oscillating and 
then stabilize its amplitude. The processor will still count 524K T cycles but the period of 
the first oscillator cycles will be irregular so an additional period of approximately 20,000 
T should be allowed for this to happen. If an external oscillator is used and it is already 
stabilized, no additional time need be provided. 


If the STOP instruction is executed when the IRQA signal is asserted, the clock genera- 
tor will not be stopped, but the 4-phase clock will be disabled for the duration of the 524K 
T cycle (or 28 T cycle) delay count. This means that in this case the STOP looks like a 
524K + 32 T cycle (or 28T+ 32T cycle) NOP, since the STOP instruction itself is 8 
instruction cycles long (32 T). 
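
The cycle counts quoted in this section follow directly from the formulas given for the SD bit, as the Python sketch below shows. The function and constant names are hypothetical; the value of 4 T per instruction cycle is inferred from the statement that 8 instruction cycles equal 32 T.

```python
T_PER_INSTRUCTION_CYCLE = 4    # inferred: 8 instruction cycles = 32 T
STOP_INSTRUCTION_CYCLES = 8    # length of the STOP sequence itself


def stop_delay_t_cycles(sd):
    """Return the clock stabilization delay count in T cycles for an SD value."""
    exponent = 5 if sd else 19
    return 2 ** exponent - 4


def stop_as_nop_t_cycles(sd):
    """Return the total T cycles when STOP executes with IRQA already asserted."""
    return stop_delay_t_cycles(sd) + STOP_INSTRUCTION_CYCLES * T_PER_INSTRUCTION_CYCLE
```

With SD=0 the delay is 524,284 T, which the text rounds to 524K T; with SD=1 it is exactly 28 T, so a STOP executed while IRQA is asserted behaves like a 60 T NOP.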


A stack error interrupt pending before entering the STOP state is not cleared and will 
remain pending. During the clock stabilization delay, all peripheral and external interrupts 
are cleared and ignored except stack error. If the on-chip peripherals have interrupts 
enabled in (1) their respective control registers and (2) in the interrupt priority register, 
then interrupts will be immediately pending after the clock recovery delay and will be ser- 
viced before continuing with the next instruction. If peripheral interrupts must be dis- 
abled, the user should disable them either with the control registers or with the interrupt 
priority register before the STOP instruction is executed. 


If the RESET pin had been used to restart the processor (see Figure 7-14), the 524K T 
cycle delay counter would not have been used, all pending interrupts would be dis- 
carded, and the processor would immediately enter the RESET processing state. The 
stabilization time required for the clock (RESET should be asserted for this time) is only 
50 T for a stabilized external clock but is the same 550,000 T for the internal oscillator. 
These stabilization times are recommended times and are not imposed by internal timers 
or time delays. The DSP fetches instructions immediately when it exits reset. If the user 
wishes to use the 524K T (or 28 T) delay counter, it can be started by asserting IRQA for 
a short time (about 2 clock cycles) to exit the stop state. 


[Timing diagram: fetch, decode, and execute pipeline stages for instructions n1 through n4
and STOP over cycle counts 1 to 8 (9); the clock is stopped, the processor enters the
RESET state, and the processor then leaves the RESET state.]

Figure 7-14 STOP Instruction Sequence Recovering with RESET


When in the stop state, the Port A bus is “frozen”. The state of each pin immediately 
before executing the STOP instruction will be held until the DSP leaves the stop state. 


Port A is not three-stated and the BR/BG/BB circuits are not operational. However, Port
A will remain three-stated if BG was asserted (in the slave mode) before the STOP
instruction was executed. One way to release the Port A bus for use while the DSP is in the
stop state is to use a Port B or Port C pin to assert BR (in the slave mode) before
executing the STOP instruction.


8.1. INTRODUCTION 

DSP56100 family external bus timing is defined by the operation of the Address Bus, 
Data Bus, and Bus Control pins described in the User's Manual for each of the DSPs in 
the DSP56100 family. The external bus is designed to interface with a wide variety of 
memory and peripheral devices, from high speed static RAMs to slower memory 
devices. Figure 8-1 shows a static RAM design using 15 ns memories. 


[Block diagram: DSP56156 connected to MCM6209-15 (64K x 4 bits) static RAMs used as
program and data memory.]

Figure 8-1 Example of SRAM Connection to a 60 MHz DSP56156 Using One Wait-State


External bus timing is controlled by the TA control signal and by the Bus Control Regis-
ter (BCR). Wait state insertion is controlled by the BCR to provide fixed bus access
timing, and by TA to provide dynamic bus access timing. The number of wait states is
determined by the TA input or by the BCR, whichever is longer.
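
The "whichever is longer" rule can be written as a single comparison. The following
Python sketch is illustrative only: the function name and arguments are hypothetical, and
it assumes that the effect of the TA input can be summarized as the number of wait states
requested by the external device.

```python
def effective_wait_states(bcr_wait_states: int, ta_wait_states: int) -> int:
    """Return the number of wait states inserted into one external bus cycle.

    bcr_wait_states: fixed count programmed in the BCR (assumed non-negative).
    ta_wait_states:  count requested dynamically through TA (assumed non-negative).
    """
    if bcr_wait_states < 0 or ta_wait_states < 0:
        raise ValueError("wait-state counts cannot be negative")
    # The longer of the two requests determines the bus cycle length.
    return max(bcr_wait_states, ta_wait_states)
```

With this model, a BCR setting of one wait state is a lower bound: TA can lengthen the
access but cannot shorten it.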


8.2 SYNCHRONOUS BUS OPERATION 
A synchronous external bus cycle consists of at least 4 internal clock phases. Each syn- 
chronous external memory access requires the following procedure: 


1. The external memory address is defined by Address Bus A0-A15 and Memory
Reference signal PS/DS. These signals change in the first phase of the exter- 
nal bus cycle. Memory Reference signal PS/DS has the same timing as the 
Address Bus and may be used as an additional address line. The Address sig- 
nals and PS/DS are also used to generate chip select for the appropriate 
memory chips. Chip select changes the memory devices from low power 
standby mode to active mode and begins the read access time. This allows 
slower memories to be used since the chip select signals are address based 
rather than read or write enable based. 


2. When the Address lines and PS/DS are stable, data transfer is enabled by the
Transfer Strobe TS signal. TS is asserted to qualify the Address signals and 
PS/DS as stable and to perform the read or write data transfer. TS is asserted 
in the second phase of the bus cycle. 


3. Wait states are inserted into the bus cycle controlled by a wait state counter or 
by TA, whichever is longer. The wait state counter is loaded from the BCR. If 
the wait state number determined by these two factors is zero, no wait state is 
inserted into the bus cycle and TS is deasserted in the fourth phase. If the wait 
state number determined is W, then W wait states are inserted into the instruction
cycle. Each wait state adds a delay of one clock cycle (two phases).
TA is sampled by the DSP on every rising edge of T2.


4. When Transfer Strobe TS is deasserted at the end of a bus cycle, the data is 
latched in the destination device. At the end of a read cycle, the DSP latches 
the data internally. At the end of a write cycle, the external memory latches the 
data. The Address signals remain stable until the first phase of the next exter- 
nal bus cycle to minimize power dissipation. The PS/DS signal is set high dur- 
ing periods of no bus activity and the data signals are three-stated. 
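
The length of a synchronous bus cycle follows directly from the procedure above: four
phases for the basic cycle plus two phases per wait state. The Python sketch below is an
illustration under stated assumptions, not a timing specification: the function names are
hypothetical, and it assumes that one clock cycle consists of two phases, as stated for
the wait state.

```python
PHASES_PER_BASIC_CYCLE = 4   # minimum synchronous external bus cycle
PHASES_PER_WAIT_STATE = 2    # each wait state adds one clock cycle (two phases)


def bus_cycle_phases(wait_states: int) -> int:
    """Number of internal clock phases in one external bus cycle."""
    if wait_states < 0:
        raise ValueError("wait_states cannot be negative")
    return PHASES_PER_BASIC_CYCLE + PHASES_PER_WAIT_STATE * wait_states


def bus_cycle_ns(clock_mhz: float, wait_states: int) -> float:
    """Bus cycle duration in nanoseconds for a given clock frequency."""
    clock_period_ns = 1000.0 / clock_mhz
    phase_ns = clock_period_ns / 2.0  # assumption: two phases per clock cycle
    return bus_cycle_phases(wait_states) * phase_ns
```

Under these assumptions, a 60 MHz clock gives a bus cycle of about 33.3 ns with no wait
states and 50 ns with one wait state.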


[Block diagram: DSP connected to an MCM6290-20 16K x 16-bit synchronous RAM through the
address bus and the CLK signals.]

Figure 8-2 MCM6290 16K x 16 Synchronous SRAM Used in 50 MHz 16-bit DSP System


Figure 8-2 shows an example of a 50 MHz 16-bit DSP connected to a 16K x 16-bit, 20
ns, synchronous RAM. Note that the PS/DS control signal is used as an additional 
address line allowing a single external memory device to be used to store both program 
(8k words) and data (8k words) memory. 
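
The effect of using PS/DS as an additional address line can be sketched as an address
calculation. The Python example below is a hypothetical illustration: it assumes that the
DSP address lines A0-A12 drive the low 13 RAM address inputs and that PS/DS drives RAM
address bit 13. Which memory space corresponds to which PS/DS level is not stated here,
so the function takes the pin level rather than a space name.

```python
BANK_WORDS = 8 * 1024  # 8K words per memory space in the shared 16K x 16 RAM


def ram_address(dsp_address: int, ps_ds_high: bool) -> int:
    """Map a DSP address and the PS/DS pin level to a 14-bit RAM address.

    Assumption: PS/DS is wired to RAM address bit 13, so each level of the
    pin selects one 8K-word half of the device.
    """
    if not 0 <= dsp_address < BANK_WORDS:
        raise ValueError("address is outside the 8K-word bank")
    return (int(ps_ds_high) << 13) | dsp_address
```

The same 13-bit DSP address therefore reaches two different RAM words depending on the
PS/DS level, which is what allows one device to hold both program and data memory.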


8.3 BUS HANDSHAKE AND ARBITRATION 

Bus transactions are governed by a single bus master. Bus arbitration determines which 
device becomes the bus master. The arbitration logic implementation is system depen- 
dent, but must result in at most one device becoming the bus master (even if multiple 
devices request bus ownership) at any given time. 


8.3.1 Bus Arbitration Signals
Three signals are provided for bus arbitration. These signals are: 


BR Bus Request: Input in the slave mode; output in the master mode
In the master mode, this output is asserted by the DSP to indicate that the DSP
wants to use the bus. The output is held asserted until the DSP no longer needs
the bus. This includes when the DSP is the bus master as well as when it is not
actively using the bus but retains bus mastership.


In the slave mode, this input is asserted by an external device to indicate to the 
DSP that the external device wants control of the external bus. In the slave 
mode, when BR is asserted, the DSP always relinquishes the bus. 


BG Bus Grant: Output in the slave mode; input in the master mode
In the master mode, this input is asserted by the bus arbitration controller to sig-
nal the DSP that the DSP is the bus master-elect. BG is valid only when the bus
is not busy. The Bus Busy signal is described below.


In the slave mode, this output pin is asserted by the DSP in response to a bus 
request BR. When BG is asserted, the DSP no longer drives the bus. 


BB Bus Busy: Output when bus master; input when not bus master
This pin is asserted by the device (bus master) that received bus ownership
from the bus arbitration controller. The master holds BB asserted for the dura-
tion of its bus possession. When asserted, BB indicates that the DSP is driving
the bus. BB deasserted indicates that the DSP is not driving the bus. BB may
be used as a three-state enable control for external address, data and bus con-
trol signal buffers.


The BB input is monitored by the DSP when it is the potential bus master (i.e.,
after BG has been asserted). The DSP will become bus master when BB is
deasserted.


Note: A DSP which is programmed as a bus master comes out of reset without pos- 
session of the bus. A DSP which is programmed as a bus slave comes out of 
reset with possession of the bus. 


8.3.2 Bus Arbitration between Two DSPs 

Figure 8-3 shows two DSPs sharing the same external bus. The three bus arbitration 
pins BR, BG, and BB allow for direct connection without external logic. The bus arbitra- 
tion is explained below. 


The two DSPs in Figure 8-3 share a common clock and common hardware reset cir- 
cuitry. DSP-1 leaves the reset state in the master mode (MC tied high) while DSP-2 
leaves the reset state in the slave mode (MC tied low). 


Figure 8-4, Figure 8-5, and Figure 8-6 show the bus arbitration between the two proces- 
sors. 


When DSP-1 needs the bus for an external access, BRm is asserted during T0. BGm is
sampled by DSP-1 during the clock’s falling edge. When BGm is asserted by DSP-2,
DSP-1 starts sampling BB on the clock’s falling edge and starts a bus cycle on the
clock’s first rising edge after BB is sampled and recognized. DSP-1 then assumes bus
mastership by asserting BB. DSP-1 deasserts BRm when BGm has been received and
the external bus is released. BRm is deasserted during T0. BB remains asserted as long
as DSP-1 drives the bus.


[Block diagram: DSP-1 (master mode) and DSP-2 (slave mode) share RESET, CLK, and the
data, address, and control buses to a shared external memory; BRm, BGm, and BB of
DSP-1 connect directly to BRs, BGs, and BBs of DSP-2.]

Figure 8-3 Bus Arbitration Between Two 16-bit DSPs


When DSP-2 receives a bus request on its BR input, it will three-state its A0-A15,
D0-D15, TS, R/W, and PS/DS pins at the earliest possible time while deasserting the BB
pin. It then asserts BG and its BB pin becomes an input. When the BR input is deasserted,
DSP-2 deasserts BG and DSP-2 regains bus control after sampling and recognizing BB
as deasserted.


When the master wishes to “park” on the bus (i.e., remain master even when it is not
making external accesses) it can set the RH bit in the BCR. This causes BR to remain
asserted until the RH bit is cleared. Bus parking is illustrated in Figure 8-6.
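
The request/grant behavior of the two-DSP arrangement, including bus parking, can be
summarized by a simplified logical model. The Python sketch below is an assumption-laden
illustration: the function name is hypothetical, each list entry represents one access
opportunity rather than a clock phase, and the sampling of BB and all phase-level timing
are deliberately ignored.

```python
from typing import List, Tuple


def arbitration_trace(master_access: List[bool],
                      rh: bool = False) -> List[Tuple[bool, bool, str]]:
    """Return (BR, BG, bus owner) for each step of a two-DSP system.

    master_access: True where the master has an external access pending.
    rh:            state of the master's RH bit (bus request hold).
    """
    trace = []
    for pending in master_access:
        br = pending or rh   # master requests for an access, or holds BR while RH=1
        bg = br              # in the slave mode, an asserted BR is always granted
        owner = "master" if bg else "slave"
        trace.append((br, bg, owner))
    return trace
```

With RH cleared, ownership returns to the slave whenever the master has no access
pending; with RH set, the master keeps the bus for every step of the trace.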


8.3.3 Bus Arbitration between a DSP56156 and an MC68020 

Figure 8-7 shows a DSP in the master mode sharing the same external bus with an 
MC68020. The three bus arbitration pins BR, BG, and BB allow direct connection without 
external logic. The bus arbitration is explained below. 


After hardware RESET, the DSP is set in the master mode (MC is tied to VCC).


[Timing diagram: master and slave clock phases (T0-T3, with Tw wait phases); the slave
samples and recognizes BR and grants the bus; the master samples and recognizes BG,
samples BB high, recognizes BB high, and gets on the bus. The slave drives the bus
until the master drives the bus.]

Figure 8-4 Master Requests and Gets the Bus for One Access


[Timing diagram: the slave samples and recognizes BR; the master samples and recognizes
BG; the slave deasserts BG, samples BB high, and recognizes BB high; the master gives
up the bus and the slave gets on the bus. The slave then drives the bus.]

Figure 8-5 Slave Gets the Bus Back After One Master Access


[Timing diagram: RH is set by the master, which asserts BR even if no access is pending;
the slave samples and recognizes BR and grants the bus; the master samples BG, samples
BB high, and gets on the bus. This pattern repeats each time the master accesses the
bus while RH=1; BB will stay asserted as long as the DSP owns the bus.]

Figure 8-6 Bus Parking by the Master


When BG is asserted by the MC68020, the DSP starts a bus cycle after sampling BB and
recognizing it as deasserted. The DSP assumes bus mastership by asserting BB and then deasserts BR if it
only wants the bus for one cycle. BR remains asserted for a series of consecutive exter- 
nal accesses or when the bus request hold bit (RH) of the BCR register is set. BB 
remains asserted as long as the DSP drives the bus and as long as BG remains 
asserted. When BG is deasserted, BB is deasserted at the end of the last external bus 
access. 


When the MC68020 receives a bus request on its BR input, it will assert BG at the earli- 
est possible time. BG will not be asserted until the end of a read-modify-write operation. 
BG will be deasserted by the MC68020 when the new bus master has asserted BGACK. 


8.3.4 Bus Arbitration with External Bus Arbitrator 

Systems that include several devices that can become bus master require external cir- 
cuitry to assign priorities to the devices. This circuitry allows only the device with the 
highest priority to become bus master when two or more devices attempt to become bus 
master simultaneously. Figure 8-8 shows an example of bus arbitration with several 
DSPs and other CPUs. 


[Block diagram: DSP56156 (16-bit DSP, master mode) and MC68020 share the external buses
to a shared external memory; BR and BG of the DSP56156 connect to BR and BG of the
MC68020, and BB connects to BGACK.]

Figure 8-7 Bus Arbitration Between a DSP56156 and an MC68020


Bus arbitration is handled by a central bus arbitrator, using individual request/grant lines
to each potential bus master. The arbitration protocol can operate in parallel with bus
transfer activity allowing fast bus acquisition. The arbitration sequence occurs as follows:


1. All candidates for bus ownership assert their respective BR signals as soon as
they need the bus.


2. The arbitration logic designates a bus master-elect by asserting the BG signal 
for that device. 


3. The master-elect tests BB to ensure that the previous master has relinquished
the bus. If BB is deasserted, then the master-elect takes control of the bus. If a
higher priority bus request occurs before the BB signal is deasserted, then
the arbitration logic may replace the current master-elect with the higher prior-
ity candidate (Figures 15-8 and 15-9 show the arbitration timing). However,
only one BG signal is allowed to be asserted at any one time.


4. The new bus master begins its bus transfers after BB is asserted. 


5. At any time, the arbitration logic can signal the current bus master to relinquish
the bus by deasserting BG. A DSP56156 bus master releases its ownership 
(deasserts BB) after completing the current external bus access and after rec- 
ognizing BG is deasserted. If BG is not deasserted, the DSP56156 bus master 
does not deassert BR, remains bus master, and continues to assert BB. If an 
instruction is executing a Read-Modify-Write external access, the DSP will 
only relinquish the bus after completing the whole Read-Modify-Write 
sequence. 
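
The grant decision of an external arbiter can be sketched as a fixed-priority selection.
The arbitration logic is system dependent, so the Python example below is only one
possible scheme: the function names are hypothetical, and the assumption that a lower
device index means a higher priority is made purely for illustration.

```python
from typing import List, Optional, Sequence


def select_master_elect(br: Sequence[bool]) -> Optional[int]:
    """Return the index of the highest-priority requesting device, if any.

    Assumption: index 0 has the highest priority.
    """
    for device, requesting in enumerate(br):
        if requesting:
            return device
    return None


def bus_grants(br: Sequence[bool]) -> List[bool]:
    """Return the BG lines; at most one BG is asserted at any one time."""
    elect = select_master_elect(br)
    return [device == elect for device in range(len(br))]


def can_take_bus(bg: bool, bb: bool) -> bool:
    """A master-elect takes the bus only when granted and BB is deasserted."""
    return bg and not bb
```

Because the selection is recomputed from the current BR lines, a higher priority request
arriving before BB is deasserted replaces the current master-elect, as described in
step 3.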


The DSP56156 has one control bit (RH) to permit software control of BR and one
status bit (BS) to verify whether it is the bus master. If the RH bit in the BCR reg-
ister is set, the DSP holds its BR signal asserted as long as requests for bus transfers
are pending. As long as the RH bit is set, BR will remain asserted. This situation is called
“bus parking” and allows the current bus master to use the bus repeatedly without re-
arbitration.


[Block diagram: a bus arbiter with individual request/grant lines (BR1/BG1, BR2/BG2, ...
BRn/BGn) to DSP56156 #1, DSP56156 #2, and another CPU, each with its own BB pin
(BB1, BB2, ... BBn).]

Figure 8-8 Bus Arbitration Between Several 16-bit DSPs and Other Processors


9.1 INTRODUCTION 


The DSP56100 Family does not contain an on-chip oscillator. An external system clock 
must be provided through the EXTAL input pin. The on-chip phase locked loop (PLL) can 
be used to generate the DSP5616 core system clock or it can be bypassed allowing the 
DSP5616 core to directly use the clock provided on the EXTAL pin. 


Figure 9-1 shows the general block diagram of the on-chip frequency synthesizer. 


The 4-bit divider ID3-ID0 defines the resolution of the PLL and divides the incoming clock
rate fed to the PLL. The eight down counter bits YD7-YD0 control down counting in the
PLL feedback loop causing it to divide by the value YD+1 (any number between 1 and
256) which effectively multiplies the frequency out of the PLL. The VCO output can be
divided down by any power of 2 between 2^0 and 2^15 before entering the core using the
four bits PD3-PD0 of the control register PCR0. The system frequency on the DSP core is
controlled by the frequency control bits of the PLL control register PCR0 as follows:


Fosc = {Fext ÷ [ID+1]} x [YD+1] ÷ (2^PD)


where ID is the value contained in ID3-ID0, YD is the value contained in YD7-YD0, and
PD is the value contained in PD3-PD0. Fext is a squared and delayed version of the clock
signal applied to the EXTAL input pin.
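
The frequency relationship can be checked numerically. The Python sketch below is
illustrative: the function name is hypothetical, the argument names mirror the ID, YD, and
PD field values, and the input frequency used in the usage note is an arbitrary example
rather than a recommended operating point.

```python
def fosc_hz(fext_hz: float, id_value: int, yd_value: int, pd_value: int) -> float:
    """Core clock frequency with the PLL output selected.

    Implements Fosc = {Fext / (ID+1)} * (YD+1) / 2**PD.
    """
    if not 0 <= id_value <= 15:
        raise ValueError("ID is a 4-bit field (0..15)")
    if not 0 <= yd_value <= 255:
        raise ValueError("YD is an 8-bit field (0..255)")
    if not 0 <= pd_value <= 15:
        raise ValueError("PD is a 4-bit field (0..15)")
    return fext_hz / (id_value + 1) * (yd_value + 1) / (2 ** pd_value)
```

For example, a 10 MHz input with ID=0, YD=5, and PD=0 yields 60 MHz; changing only PD to
1 halves the result to 30 MHz. The resulting values must still respect the limits given
in the technical data sheet.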


Note: The STOP instruction does not power down the PLL if the PLL is enabled 
(PLLD=0) when entering the STOP mode. STOP will power down the ID register if 
the PLL is disabled (PLLD=1) when entering the STOP mode. (see Section 9.3.4). 


[Block diagram of the on-chip frequency synthesizer, including the 8-bit PLL down counter,
the PS=0/PS=1 clock selection, and the internal phase PH0.]

On-chip Frequency Synthesis Control/Status Registers

PLL Control Register (PCR1), read-write, address $FFDC
Bits 15-10: LOCK, PLLE, PLLD, PS, CS1, CS0; remaining bits reserved

PLL Control Register (PCR0), read-write, address $FFDB
Bits 15-12: PD3-PD0; bits 11-8: ID3-ID0; bits 7-0: YD7-YD0

Figure 9-1 DSP56100 Family Frequency Synthesizer
Block Diagram and Control Registers


9.2 ON-CHIP CLOCK SYNTHESIS CONTROL REGISTER PCR0


The Clock Synthesis Control Register PCR0 is a 16-bit read/write register used to direct
the operation of the on-chip clock synthesis. The PCR0 controls the frequency program-
ming of the PLL. The PCR0 control bits are described in the following sections.


All PCR0 bits are cleared by DSP hardware reset. Software reset does not affect this register.


9.2.1 PCR0 Feedback Divider Bits (YD7-YD0) Bits 0-7

The eight feedback divider bits YD7-YD0 control the down counter in the feedback loop,
causing it to divide by the value YD+1 where YD is the value contained in the eight bits.
Changing these bits requires a time delay for the Voltage Controlled Oscillator (VCO) to
lock again.


The LOCK bit is cleared any time a new value is written to the YD bits.


The resulting DSP core system clock must be within the limits specified by the technical
data sheet. The frequency of the VCO should also remain higher than the minimum value
specified in that data sheet.


9.2.2 PCR0 Input Divider Bits (ID3-ID0) Bits 8-11

The four input divider bits are used to divide the input clock frequency by any number
between 1 and 16. The output of the divider is used as input for the phase comparator of the
PLL. If ID is the value contained in the four bits, the input clock to the PLL is divided by
ID+1.


Any time a new value is written to the ID bits, the LOCK bit is cleared. 


9.2.3 PCR0 Power Divider Bits (PD3-PD0) Bits 12-15

The four power divider bits are used to divide the VCO output clock frequency by any
power of two between 2^0 and 2^15 (i.e., 1, 2, 4, 8, 16, 32, ..., 16384, or 32768). The output of
the divider can be used as the operating clock for the DSP core, as shown in Figure 9-1.
Writing to the PD bits does not affect the LOCK condition of the PLL.


The PD bits can be used to switch the DSP core back and forth from a high MIPS rate to 
a very low speed, low power mode without having to wait and check for the PLL to lock 
on a new frequency. 
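
The PCR0 field layout described in Sections 9.2.1 to 9.2.3 can be captured by a pair of
packing helpers. This Python sketch is illustrative; the function names are hypothetical,
and only the bit positions (YD in bits 0-7, ID in bits 8-11, PD in bits 12-15) are taken
from the text.

```python
from typing import Tuple


def pack_pcr0(pd_value: int, id_value: int, yd_value: int) -> int:
    """Build a 16-bit PCR0 value from its three fields."""
    if not (0 <= pd_value <= 0xF and 0 <= id_value <= 0xF and 0 <= yd_value <= 0xFF):
        raise ValueError("field value out of range")
    return (pd_value << 12) | (id_value << 8) | yd_value


def unpack_pcr0(value: int) -> Tuple[int, int, int]:
    """Return (PD, ID, YD) extracted from a 16-bit PCR0 value."""
    return (value >> 12) & 0xF, (value >> 8) & 0xF, value & 0xFF


def with_power_divider(pcr0: int, pd_value: int) -> int:
    """Change only the PD field, leaving ID and YD (and thus LOCK) untouched."""
    _, id_value, yd_value = unpack_pcr0(pcr0)
    return pack_pcr0(pd_value, id_value, yd_value)
```

Because `with_power_divider` rewrites the same ID and YD values, it models the low power
switch described above, in which only the PD bits change.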


9.3 ON-CHIP CLOCK SYNTHESIS CONTROL REGISTER PCR1

The Clock Synthesis Control Register PCR1 is a 16-bit read/write register used to direct
the operation of the on-chip clock synthesizer. The PCR1 control bits are described in the
following sections.


All PCR1 bits are cleared by DSP hardware reset. Software reset does not affect this register.


9.3.1 PCR1 Reserved Bits — Bits 0-9 
These bits are reserved and should be written as zero by the user. 


9.3.2 PCR1 CLKO Select Bits (CS1-CS0) Bits 10 and 11


The two CLKO Select bits CS1-CS0 enable one of three possible clocks to be output to
the CLKO pin when the CD bit in the OMR register is cleared (see Figure 9-1). After hard-
ware reset, the internal DSP core clock PH0 (phase zero) is output to the CLKO pin. PH0
is a delayed version of the DSP core master clock, Fosc. By changing the value of the two
bits CS1-CS0 according to Table 9-1, Fext or Fext/2 can be selected to be output on
CLKO. Fext is a squared and delayed version of the signal applied to the EXTAL input pin.


Table 9-1 CLKOUT Pin Control


CS1   CS0   CLKO
 0     0    PH0
 0     1    Reserved
 1     0    Fext
 1     1    Fext/2


9.3.3. PCR1 Phase Select Bit (PS) Bit 12 

This bit is used to select the DSP core clock when the PLL output is not selected 
(PLLE=0). When this bit is cleared, a squared version of EXTAL is selected as Fosc. 
When this bit is set, the output of the ID divider is selected as Fosc. 


9.3.4 PCR1 PLL Power Down Bit (PLLD) Bit 13 
When the PLLD bit is set, the on-chip PLL is powered down. When this control bit is 
cleared, the on-chip PLL is turned on. This bit should not be set when the PLLE bit is set. 


If the PLL has to be turned off before entering the STOP mode, the following sequence 
will have to be executed before the STOP instruction: 


- Clear the PLLE bit (switch back to EXTAL)
- Set the PLLD bit (power down the PLL)
- Execute the STOP instruction.


Setting the PLLD bit clears the LOCK bit. Setting the PLLD bit powers down the complete
PLL block including the PD and YD registers.
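
The order of the two register writes in the sequence above can be sketched as follows.
This Python example only computes the successive PCR1 values; it is a hypothetical
illustration, not DSP code. The function name is invented, and the bit positions (PLLE in
bit 14, PLLD in bit 13) are taken from the section headings.

```python
from typing import List

PLLE = 1 << 14  # PLL enable bit of PCR1
PLLD = 1 << 13  # PLL power down bit of PCR1


def pcr1_writes_before_stop(pcr1: int) -> List[int]:
    """Return the two PCR1 values to write, in order, before executing STOP."""
    bypassed = pcr1 & ~PLLE & 0xFFFF   # step 1: clear PLLE (switch back to EXTAL)
    powered_down = bypassed | PLLD     # step 2: set PLLD (power down the PLL)
    return [bypassed, powered_down]
```

Keeping the writes separate ensures that PLLD is never set while PLLE is still set, a
combination that Section 9.3.4 says should be avoided.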


9.3.5 PCR1 PLL Enable Bit (PLLE) Bit 14 

When the PLLE bit is set, the DSP5616 core system clock is generated by the on-chip 
PLL. Table 9-2 summarizes the function of the three bits — PLLE, PLLD and PS. The 
state of the PLL is defined by the PLLD bit. When the PLLD bit is set, the PLL is in the 
power down mode. When the PLLD bit is cleared, the PLL is in the active mode. Before 
turning the PLL off, the PLLE bit should be cleared in order to by-pass the PLL. The PLL 
can then be put in power down mode by setting PLLD. 


If the output frequency of the PLL has to be changed by re-programming the YD bits while 
the PLL output is used by the core (PLLE=1; PLLD=0), the following sequence of opera- 
tions should be performed: 


- Clear the PLLE bit to switch back to EXTAL
- Program the YD bits (only after clearing PLLE)
- Wait for the LOCK bit to be set
- Set PLLE after the LOCK bit is tested high.


Table 9-2 PLL Operations


PLLE  PLLD  PS  Fosc                                 PLL Mode
 0     0    0   Fext                                 Active
 0     1    0   Fext                                 Power Down
 0     0    1   Fext ÷ [ID+1]                        Active
 0     1    1   Fext ÷ [ID+1]                        Power Down
 1     0    x   {Fext ÷ [ID+1]} x [YD+1] ÷ (2^PD)    Active
 1     1    x   Reserved                             -
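
Table 9-2 can be read as a small decision function. The Python sketch below is a
hypothetical illustration of that reading; the function and argument names are invented,
and the reserved combination is reported as an error purely as a modeling choice.

```python
def fosc_for_mode(fext_hz: float, plle: int, plld: int, ps: int,
                  id_value: int = 0, yd_value: int = 0, pd_value: int = 0) -> float:
    """Return Fosc for a PLLE/PLLD/PS combination, following Table 9-2."""
    if plle and plld:
        # PLLE=1 with PLLD=1 is listed as reserved.
        raise ValueError("PLLE=1 with PLLD=1 is a reserved combination")
    if plle:
        # PLL output selected; the PS bit is a don't-care.
        return fext_hz / (id_value + 1) * (yd_value + 1) / (2 ** pd_value)
    if ps:
        # PLL bypassed, output of the ID divider selected.
        return fext_hz / (id_value + 1)
    # PLL bypassed, squared EXTAL selected.
    return fext_hz
```

Note that PLLD does not appear in the frequency calculation when PLLE=0; in those rows it
only determines whether the PLL itself is active or powered down.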


9.3.6 PCR1 Voltage Controlled Oscillator Lock Bit (LOCK) Bit 15 

This status bit shows whether the Voltage Controlled Oscillator (VCO) has locked on the 
desired frequency or not. When the LOCK bit is set, the VCO has locked; when the LOCK 
bit is cleared, the VCO has not locked yet. This bit is cleared when setting the PLLD bit 
and when changing the value of ID or YD bits. The LOCK bit is not cleared when clearing 
the PLLE bit without changing the values of PLLD, YD, or ID. 


This bit is read-only and cannot be written by the DSP core. 


[Block diagram: the input divider (ID3-ID0), the PLL with its filter and VCO, the 8-bit
PLL down counter (YD7-YD0, ÷1 to ÷256), the 4-bit power of two divider (PD3-PD0,
÷2^0 to ÷2^15) producing Fosc when PLLE=1, and the internal phase PH0.]


On-chip Frequency Synthesis Control/Status Register (PCR1) ADDRESS X:$FFDC

Bit 15      LOCK        0   PLL unlocked
                        1   PLL locked
Bits 14-13  PLLE PLLD   00  PLL active but not used as Fosc
                        01  PLL powered down
                        10  PLL active and used as Fosc
                        11  Reserved
Bit 12      PS          0   Squared EXTAL selected as Fosc if PLLE=0
                        1   Squared EXTAL/ID selected as Fosc if PLLE=0
Bits 11-10  CS1-CS0     00  PH0 output to CLKO when enabled by the CD bit (bit 7) of the OMR
                        01  Reserved
                        10  Fext output to CLKO when enabled by the CD bit (bit 7) of the OMR
                        11  Fext/2 output to CLKO when enabled by the CD bit (bit 7) of the OMR
Bits 9-0    Reserved


On-chip Frequency Synthesis Control/Status Register (PCR0) ADDRESS X:$FFDB

Bits 15-12  PD3-PD0     Clock output divider: values $0 to $F divide the VCO output
                        clock by 2^PD (1, 2, 4, ..., 16384, 32768)
Bits 11-8   ID3-ID0     Input clock divider: values $0 to $F divide the input clock
                        by ID+1 (1 to 16)
Bits 7-0    YD7-YD0     VCO down counter value: multiplies by YD+1


Figure 9-2 On-Chip Frequency Synthesizer Programming Model Summary.


10.1. INTRODUCTION 


This section describes a set of circuits used for hardware/software emulation and debug
on the DSP56100 family. OnCE provides a means of interacting with the DSP and any
memory mapped peripherals non-intrusively so that a user may examine registers, memory
or on-chip peripherals. To achieve this, special circuits and dedicated pins on the DSP
are used to avoid sacrificing any user accessible on-chip resource. A key feature of the
special OnCE pins is that they allow the user to insert the DSP into the target system
while retaining debug control, especially in the case of devices specified without an
external bus. The need for a costly cable which brings out the footprint of the chip, as
on traditional emulator systems, is eliminated.


Figure 10-1 illustrates a block diagram of the Emulation and test serial interface. 


10.2 EMULATION AND TEST PINOUT 


10.2.1 Debug Serial Input/OnCE Status 0 (DSI/OS0)


The DSI/OS0 pin, when an input, is the pin through which serial data or commands are pro-
vided to the OnCE controller. The data received on the DSI pin is recognized only when
the DSP has entered the debug mode of operation. Data is always shifted into the OnCE
serial port most significant bit (MSB) first on the falling edge of the OnCE serial clock,
DSCK. When an output, this pin, in conjunction with the OS1 pin, provides information about
the chip status when debug mode cannot be entered in response to an external request.
The DSI/OS0 pin is an output when not in Debug Mode (i.e., until the acknowledge signal
is issued to the Command Controller). When switching from output to input, the pin is
three-stated. In order to avoid any possible glitches, an external pull-down resistor should
be attached to this pin. During hardware reset, this pin is defined as an output and it is
driven low.


10.2.2 Debug Serial Clock/OnCE Status 1 (DSCK/OS1) 


The DSCK/OS1 pin, when an input, is the pin through which the serial clock is supplied to 
the OnCE controller. The serial clock provides pulses required to shift data into and out of 
the OnCE serial port. Data is shifted into the chip via the DSI pin on the falling edge of 
DSCK and is shifted out of the chip via the DSO pin on the rising edge of DSCK. When 
an output, this pin, in conjunction with the OS0 pin, provides information about the chip
status when debug mode cannot be entered in response to an external request. The
DSCK/OS1 pin is an output when not in Debug Mode (until the acknowledge signal is
issued to the Command Controller). When switching from output to input, the pin is first
three-stated. In order to avoid any possible glitches, an external pull-down resistor should 
be attached to this pin. During hardware reset, this pin is defined as output and it is driven 
low. 


Note: PILB = Program Instruction Latch Bus


Figure 10-1 OnCE Block Diagram


Table 10-1 shows the status of the chip as a function of the two output pins OS1:OS0.


Table 10-1 Function of OS1:OS0

Status

Normal state
STOP or WAIT mode
DSP busy state (external accesses with wait state)
reserved

10.2.3. Debug Serial Output (DSO) 


The DSO pin, while in debug mode, is the serial output that permits reading the data con- 
tained in one of the OnCE controller registers as specified by the last command received 
from the external command controller. Data is shifted out of the chip via the DSO pin on 
the rising edge of DSCK. An acknowledgment pulse will be sent on the DSO pin when: 


1. the chip enters the OnCE mode (external, DR, hardware breakpoint, software 
breakpoint or trace) to indicate that the chip is ready to accept OnCE com- 
mands. This pulse is 3T long. 


2. a “do nothing” operation (no go, no exit) is selected to indicate that the input 
command register is ready to receive a new command. This pulse is 4T long. 


3. the requested data (before a read) is available to indicate that the serial shift 
registers are ready to receive clocks to start transmitting data to the DSO pin. 
This pulse is 4T long. 


4. the shift registers are ready to receive clocks to receive data (before a write)
from the DSI pin. This pulse is 4T long.


5. the shift registers have finished shifting in the new data (after a write) to indi-
cate that the input command register is now ready to receive a new instruction.
This pulse is 4T long.


6. an instruction has completed execution (go, no exit; repeat an instruction). 
This pulse is 4T long. 


Data is always shifted out the OnCE serial port most significant bit (MSB) first on the rising 
edge of DSCK. When not in debug mode, the DSO pin is driven high. During hardware 
reset this pin is driven high. 


10.2.4 Debug Request Input (DR) 


The DR input is an active low pin that provides a means of entering the debug mode of 
operation from the external command controller. This pin, when asserted, will cause the 
DSP to finish the current instruction being executed, save the instruction pipeline informa- 
tion, enter the debug mode and wait for commands to be entered from the debug serial 
input line. 


10.3. ONCE CONTROLLER AND SERIAL INTERFACE 


The OnCE Controller and Serial Interface contains the following blocks: input shift regis- 
ter, bit counter, OnCE decoder and the status/control register. Figure 10-2 illustrates a 
block diagram of the OnCE serial interface. 


10.3.1 OnCE Input Shift Register (OISR) 


The OISR is an 8-bit shift register that receives the serial data from the DSI line. The data 
is clocked into the register on the falling edge of the clock applied to the DSCK pin. After 
the 8th bit is received the OISR will stop shifting in new data. The latched data will be used 
as input for the OnCE Decoder. The data is always shifted into the OISR most significant 
bit (MSB) first. 
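
The behavior of the OISR can be illustrated with a small software model. The Python
sketch below is not a description of the hardware implementation: the class and method
names are hypothetical, and each method call stands for one falling edge of DSCK.

```python
class OnceInputShiftRegister:
    """Software model of an 8-bit, MSB-first input shift register."""

    def __init__(self) -> None:
        self.value = 0
        self.bits_received = 0

    def falling_edge(self, dsi_bit: int) -> None:
        """Shift in one DSI bit; ignored once eight bits have been received."""
        if self.bits_received >= 8:
            return  # the register stops shifting after the 8th bit
        self.value = ((self.value << 1) | (dsi_bit & 1)) & 0xFF
        self.bits_received += 1

    @property
    def command_ready(self) -> bool:
        """True when a complete 8-bit command has been latched."""
        return self.bits_received == 8
```

Shifting in the bit sequence 1, 0, 1, 0, 0, 1, 0, 1 leaves the value $A5 in the model, and
any further bits are ignored.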


10.3.2 | OnCE Bit Counter (OBC) 


The OBC is a 4-bit counter (0...15) associated with shifting in and out the data bits. The 
OBC is incremented by the falling edges of the DSCK. The OBC is cleared at reset and 
whenever the DSP acknowledges that the Debug Mode has been entered. The OBC sup- 
plies two signals to the OnCE Decoder: one indicating that the first 8 bits were shifted-in 
(SO a new command is available) and the second indicating that 16 bits were shifted-in 
(the data associated with that command is available) or that 16 bits were shifted-out (the 
data required by a read command was shifted out). 
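
The command phase of the shift-in behavior described above can be sketched as a small software model. This is an illustrative sketch only, not the hardware implementation: the class name, the helper msb_first, and the choice to model only the first eight bits (the command) are assumptions made for the example.

```python
def msb_first(value, width):
    """Return the bits of value, most significant bit first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


class OnceInputShifter:
    """Toy model of the OISR with a bit count (hypothetical names)."""

    def __init__(self):
        self.oisr = 0        # 8-bit command shift register
        self.bits_seen = 0   # stands in for the OBC during the command phase

    def clock_in(self, bit):
        """Shift one DSI bit in on a falling DSCK edge.

        Returns True once eight bits have been received, i.e. when a
        complete command is available to the decoder.
        """
        if self.bits_seen < 8:
            # The OISR stops shifting after the 8th bit.
            self.oisr = ((self.oisr << 1) | (bit & 1)) & 0xFF
            self.bits_seen += 1
        return self.bits_seen == 8


shifter = OnceInputShifter()
for b in msb_first(0x89, 8):
    command_ready = shifter.clock_in(b)
```

After the loop, command_ready is true and shifter.oisr holds the byte that was sent, because the first bit shifted in ends up in the most significant position.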

Figure 10-2 OnCE Controller and Serial Interface


10.3.3 OnCE Decoder (ODEC) 


The ODEC is the supervisor of the entire OnCE activity. It receives as input the 8-bit com- 
mand from the OISR, two signals from OBC (one indicating that 8 bits have been received 
and the other that 16 bits have been received), and one signal indicating that the DSP has 
halted. The ODEC generates all the strobes required for reading and writing the selected 
OnCE registers. 


10.3.4 OnCE Status and Control Register (OSCR)


The OSCR is a 16-bit register used to select the events that will put the chip in Debug
Mode. Breakpoints may be disabled or enabled on one memory space. The Trace Mode 
of operation is also selected through OSCR. 


OSCR is shown in Table 10-2 and the control bits are described in the following para- 
graphs. 

Table 10-2 OnCE Status and Control Register (OSCR)


            |----- Status -----|              |--------- Control ---------|
Bit   15-11     10     9     8       7-5        4     3     2     1     0
      Reserved  TO     HBO   SBO     Reserved   TME   BS1   BS0   BE1   BE0


10.3.4.1 OSCR Breakpoint Enables (BE0-BE1) Bits 0-1


These control bits enable or disable the breakpoint logic and select the type of memory 
operations (read; write; access) upon which the breakpoint logic operates. These bits are 
cleared on hardware reset. 


BE1   BE0   Selection

0     0     Breakpoint disabled
0     1     Breakpoint enabled on memory write
1     0     Breakpoint enabled on memory read
1     1     Breakpoint enabled on memory access


10.3.4.2 OSCR Breakpoint Selection (BS0-BS1) Bits 2-3


These control bits select if the Breakpoints will be recognized on program memory fetch, 
program memory access, X memory access or second X memory read. These bits are 
cleared on hardware reset. 


BS1   BS0   Selection

0     0     Breakpoint on program memory fetch (fetch of the first word of instructions
            which are actually executed; not of those which are killed, not of those
            which are the second word of two-word instructions, and not of jumps which
            are not taken)

0     1     Breakpoint on any program memory access (any MOVEM instructions, fetches of
            instructions which are executed and of instructions which are killed,
            fetches of second word of two-word instructions, and fetches of jumps which
            are not taken)

1     0     Breakpoint on first X memory (XAB1) access

1     1     Breakpoint on second X memory (XAB2) read
            (XAB2 cannot be used to write data into the X memory)




The decoding scheme for BS(1:0) and BE(1:0) is as follows:


Function                       BS(1:0)   BE(1:0)

disable                        xx        00

program fetch                  00        01
program fetch                  00        10
program fetch                  00        11

any program write or fetch     01        01
any program read or fetch      01        10
any program access or fetch    01        11

XAB1 write                     10        01
XAB1 read                      10        10
XAB1 access                    10        11

disable                        11        01
XAB2 read                      11        10
XAB2 read                      11        11


10.3.4.3. OSCR Trace Mode Enable (TME) Bit 4 


This control bit, when set, enables the Trace Mode. When the Trace Mode is enabled, the 
chip will enter the Debug Mode whenever the execution of an instruction is completed and 
the Trace Counter is zero. This bit is cleared on hardware reset. 


10.3.4.4 OSCR (Reserved) Bits 5-7 

These bits are reserved for future use and read as zero. Reserved bits should be written 
as zero for future compatibility. 

10.3.4.5 OSCR Software Breakpoint Occurrence (SBO) Bit 8 


This read-only status bit is set when the debug mode has been entered by a DEBUG or 
DEBUGcc instruction. It is used by the external command controller to determine how the 
debug mode was entered. This bit is cleared when leaving the debug mode and is also 
cleared on hardware reset. 


10.3.4.6 OSCR Hardware Breakpoint Occurrence (HBO) Bit 9 


This read-only status bit is set when a OnCE hardware breakpoint occurs. It is used by 
the external command controller to determine how the debug mode was entered. This bit 
is cleared when leaving the debug mode and it is also cleared on hardware reset. 


10.3.4.7 OSCR Trace Occurrence (TO) Bit 10


This read-only status bit is set when the debug mode of operation is entered from a dec- 
rement to zero of the trace counter and the trace mode has been armed. This bit is cleared 
on reset and when leaving the debug mode. 


10.3.4.8 OSCR Reserved — Bits 11-15 


These bits are reserved for future use and read as zero. Reserved bits should be written 
as zero for future compatibility. 
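
The bit positions given in the paragraphs above (BE in bits 0-1, BS in bits 2-3, TME in bit 4, SBO in bit 8, HBO in bit 9, TO in bit 10) can be unpacked from a 16-bit OSCR value as sketched below. The function and key names are illustrative and are not part of any Motorola tool; only the bit positions are taken from the text.

```python
def decode_oscr(value):
    """Split a 16-bit OSCR value into its documented fields."""
    value &= 0xFFFF
    return {
        "BE": value & 0x3,                 # bits 1-0, breakpoint enables
        "BS": (value >> 2) & 0x3,          # bits 3-2, breakpoint selection
        "TME": bool(value & (1 << 4)),     # bit 4, trace mode enable
        "SBO": bool(value & (1 << 8)),     # bit 8, software breakpoint occurrence
        "HBO": bool(value & (1 << 9)),     # bit 9, hardware breakpoint occurrence
        "TO": bool(value & (1 << 10)),     # bit 10, trace occurrence
    }


def entry_status_flags(value):
    """Return the names of the status bits that are set (SBO, HBO, TO)."""
    fields = decode_oscr(value)
    return [name for name in ("SBO", "HBO", "TO") if fields[name]]


fields = decode_oscr(0x0216)
```

A command controller could use the status flags to tell how the Debug Mode was entered; an empty list would suggest an entry that sets none of the three occurrence bits, such as an external request on the DR pin, which is an inference from the text rather than a documented rule.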


10.4 OnCE BREAKPOINT LOGIC 


Other processors traditionally set a breakpoint in program memory by replacing the in- 
struction at the breakpoint address with an illegal instruction which causes a breakpoint 
exception. This technique is limiting in that breakpoints can only be set in RAM at the be- 
ginning of an opcode and not on an operand. Using such techniques, breakpoints can 
never be set in data memory. 


On the other hand, by using address comparators, breakpoints may be set on program
memory opcodes or any data memory location. This significantly increases the programmer’s
ability to monitor what the program is doing in real time.


The breakpoint logic can be enabled for Program memory breakpoints or for Data memory 
breakpoints. It contains an address latch, a register that stores the breakpoint address, a 
comparator and a counter. Figure 10-3 illustrates a block diagram of the OnCE Breakpoint 
Logic. 


10.4.1 OnCE Breakpoint Logic Operation 


The address comparator register is useful in halting a program at a specific point to ex- 
amine/change registers or memory. Using the address comparator to set breakpoints en- 
ables the user to set breakpoints in RAM or ROM while in any operating mode. 


The address comparator generates a logic true signal when the address on the bus is
equal to the stored breakpoint address. The breakpoint counter is then decremented if it
is greater than zero. If the breakpoint counter is equal to zero, it is not decremented
and a breakpoint occurs.


Conditional jump addresses produced by the instruction pipeline that are equal to the
program address being monitored are only valid if the conditional jump is taken;
otherwise the conditional jump address is ignored. Program memory address breakpoints
occur after the opcode or operand is executed and the breakpoint counter has been
decremented to zero.

Figure 10-3 Breakpoint Logic


Data memory address breakpoints also occur after the execution of the instruction which 
formed the data memory address and the breakpoint counter has decremented to zero. 
The breakpoint registers are controlled by the debug status and control register (OSCR). 


10.4.2  Breakpoint Counter 


The breakpoint counter is a 16-bit counter that is useful for stopping at the nth iteration of
a program loop or at the nth occurrence of a data memory access. This capability
significantly decreases algorithm debug time and provides a means of checking hot
spots in program segments as well as peripheral or data memory accesses.


The breakpoint counter becomes a powerful tool when debugging real-time fast interrupt
sequences such as servicing an A/D or D/A converter or stopping after a specific number
of host transfers have occurred. The breakpoint counter is cleared by reset.


10.4.3 OnCE Memory Address Latch (OMAL)


The Memory Address Latch (OMAL) is a 16-bit register that latches the PAB, XAB1, or 
XAB2 on every cycle. 


10.4.4 Memory Breakpoint Address Register (OMBAR)


The Memory Breakpoint Address Register (OMBAR) is a 16-bit register that stores the 
memory breakpoint address. OMBAR is available for read/write operations only through 
the OnCE serial interface. Before enabling breakpoints, OMBAR must be loaded by the 
command controller. 


10.4.5 Memory Address Comparator (OMAC)


The Memory Address Comparator (OMAC) is a 16-bit comparator that compares the current
memory address (stored by OMAL) with the Memory Breakpoint Address Register (OMBAR).
If OMBAR is equal to OMAL, the comparator delivers a signal indicating that the
breakpoint address has been reached.


10.4.6 Memory Breakpoint Counter (OMBC) 


The Memory Breakpoint Counter (OMBC) is a 16-bit counter which is loaded
with a value equal to the number of times minus one that a program or data memory
address should occur before a breakpoint is acknowledged. On each occurrence the counter
is decremented. When the counter has reached the value of zero and a new occurrence 
takes place, a signal is generated and, if breakpoints are enabled in OSCR, the chip will 
enter the Debug Mode. OMBC is available for read/write operations only through the 
OnCE serial interface. Before enabling Memory Breakpoints, OMBC must be loaded by 
the command controller. 
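
The interaction of OMBAR, the comparator, and OMBC described above can be modeled in a few lines. This is a behavioral sketch under simplifying assumptions: it ignores the BE/BS qualification of the access type and treats every call as one latched bus address. The class and method names are invented for the example.

```python
class BreakpointLogic:
    """Toy model of the address comparator and breakpoint counter."""

    def __init__(self, ombar, ombc):
        self.ombar = ombar & 0xFFFF   # breakpoint address
        self.ombc = ombc & 0xFFFF     # occurrences to skip (N - 1)

    def observe(self, address):
        """Present one latched address; return True if a breakpoint occurs."""
        if (address & 0xFFFF) != self.ombar:
            return False              # comparator output is false
        if self.ombc > 0:
            self.ombc -= 1            # count this occurrence and continue
            return False
        return True                   # counter already zero: breakpoint


# Stop on the third access to $0100: load OMBC with 3 - 1 = 2.
logic = BreakpointLogic(ombar=0x0100, ombc=2)
hits = [logic.observe(a) for a in (0x0100, 0x0050, 0x0100, 0x0100)]
```

In this run only the last element of hits is true: the first two matching accesses decrement the counter from 2 to 0, and the third match finds the counter at zero.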


10.5 TRACE/STEP MODE 


When in the special trace mode, the DSP will not cause an interrupt exception but instead 
will enter the debug operation mode and wait for further instructions from the debug serial 
port. Single or multiple instructions can be traced. 


10.5.1 Trace Counter 


The trace mode has a 16-bit counter associated with it so that more than one instruction
may be executed before returning to the debug mode of operation. The objective of
the counter is to allow the user to take multiple instruction steps in real time with no
interference from the debug mode. This feature helps the software developer debug sections
of code which do not have a normal flow or are getting hung up in infinite loops. The trace 
counter also enables the user to debug areas of code which are time critical. 


To enable the trace mode of operation the counter is loaded with a value, the program 
counter is set to the start location of the instruction(s) to be executed real-time, the trace 
mode is selected in the debug status register (OSCR) and the DSP exits the debug mode 
by executing the appropriate command issued by the external command controller. Upon 
exiting the debug mode the counter is decremented after each execution of an instruction.
Interrupts are serviceable and all instructions executed including fast interrupt services 
will decrement the trace counter. Upon decrementing to zero the DSP will re-enter the de- 
bug mode, the trace occurrence bit in the debug status/control register (OSCR) will be set 
and the debug serial output pin DSO will be toggled to indicate that the DSP OnCE port 
is requesting service. 


Note: The trace count should be loaded with one less than (i.e., N-1) the number of in- 
structions that the user wants to execute (e.g., to single step one instruction, the 
trace counter is loaded with a zero). 


The Trace counter is cleared by hardware reset. Figure 10-4 illustrates a block diagram 
of the Trace Counter logic. 
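
The N-1 loading rule in the note above can be checked with a small model. The sketch assumes the behavior stated in 10.3.4.3, namely that the chip re-enters the Debug Mode when an instruction completes while the Trace Counter is zero, and that the counter is otherwise decremented once per executed instruction. The function names are illustrative.

```python
def trace_count_for(instructions):
    """Value to load into the trace counter to execute N instructions."""
    if instructions < 1:
        raise ValueError("at least one instruction must be executed")
    return (instructions - 1) & 0xFFFF


def instructions_until_debug(trace_count):
    """Count instructions executed before the Debug Mode is re-entered."""
    counter = trace_count & 0xFFFF
    executed = 0
    while True:
        executed += 1          # one instruction completes execution
        if counter == 0:
            return executed    # counter is zero: re-enter Debug Mode
        counter -= 1


single_step = instructions_until_debug(trace_count_for(1))
```

With a loaded value of zero the model executes exactly one instruction, which matches the single-step case given in the note.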


10.6 METHODS OF ENTERING THE DEBUG MODE 


Entering the Debug Mode is acknowledged by the chip by toggling the DSO line for 3 T 
cycles. This informs the external command controller that the chip has entered the Debug 
Mode and is waiting for commands. There are seven ways in which the Debug Mode may 
be entered. They are: 


1. External Request During Hardware Reset
2. External Request During Normal Activity
3. External Request During STOP
4. External Request During WAIT
5. Software Request During Normal Activity
6. Enabling Trace Mode
7. Enabling Breakpoints


10.6.1 External Request During Hardware Reset 


Holding the DR line asserted during the assertion of RESET will cause the chip to enter 
the Debug Mode. After receiving the acknowledge, the command controller must deassert 
the DR line. Note that in this case the chip does not perform any fetch or memory access 
before entering the Debug Mode. 


10.6.2 External Request During Normal Activity 


Holding the DR line asserted during the normal chip activity will cause the chip to finish 
execution of the current instruction and then enter the Debug Mode. After receiving the 
acknowledge, the command controller must deassert the DR line. Note that in this case
the chip completes execution of the current instruction and stops after the newly fetched 
instruction enters the instruction latch. This process is the same for any newly fetched in- 
struction including instructions fetched during interrupt processing or instructions that will 
be killed by the interrupt processing. 


10.6.3 External Request During STOP 


Asserting DR when the chip is in the stop state (i.e., it has executed a STOP instruction) 
causes the chip to exit the stop state and enter the Debug Mode. The chip will wake up 
from the stop state normally (finish executing STOP) and halt after the next instruction en- 
ters the instruction latch. After receiving the acknowledge, the command controller must 
deassert DR. Note that in this case the chip completes the execution of the STOP instruc- 
tion and halts after the next instruction enters the instruction latch. 


10.6.4 External Request During WAIT 


Asserting DR when the chip is in the wait state (i.e. has executed a WAIT instruction) 
causes the chip to exit wait state and enter the Debug Mode. The chip will wake up from 
the wait state normally (finish executing WAIT) and halt after the next instruction enters 
the instruction latch. After receiving the acknowledge, the command controller must deas- 
sert DR. Note that in this case the chip completes execution of the WAIT instruction and 
halts after the next instruction enters the instruction latch. 


10.6.5 Software Request During Normal Activity


Upon executing the DEBUG or DEBUGcc instructions (with condition true for DEBUGcc),
the chip will enter Debug Mode after the instruction following the DEBUG/DEBUGcc
instruction has entered the instruction latch.


Figure 10-4 Trace Counter Logic


10.6.6 Enabling Trace Mode


When the chip is operating in Trace Mode and the Trace Counter reaches a value of zero,
the chip will enter the Debug Mode after completing execution of the instruction that
caused the Trace Counter to decrement. Only those instructions that are actually executed
may cause the Trace Counter to decrement; i.e., a killed instruction (an instruction
discarded during the interrupt process) will not decrement the Trace Counter and will not
cause the chip to enter the Debug Mode.


10.6.7 Enabling Breakpoints 


The chip will enter the Debug Mode after completing execution of the instruction that 
caused the Breakpoint Counter to decrement when: 


1. operating in the Trace Mode when the Breakpoint Counter has reached zero 
or 


2. when operating in Normal Mode with the Breakpoint mechanism enabled and 
the Breakpoint Counter has reached zero. 


In the case of breakpointing on: 


1. Program memory addresses, the breakpoint will be acknowledged immedi- 
ately after the execution of the instruction accessed at the specified address. 


2. Data memory addresses, the breakpoint will be acknowledged after the completion
of the instruction following the instruction that caused the access at the
specified address.


10.7 PIPELINE INFORMATION 


The previous chip pipeline state must be reconstructed to resume normal chip activity 
when returning from the Debug Mode. Figure 10-5 illustrates a block diagram of Pipeline 
Information Registers. Only the PDB register and the PIL register are used to reconstruct 
the pipeline as it was before debug. The PAB History Buffer, PAB Register for Fetch and
PAB Register for Decode are only used for status information. When loading a one word
instruction into the PDB and issuing a GO command, the hardware internally transfers the
PDB to the PIL and then executes the instruction. When loading a two word instruction, 
the first word is loaded into the PDB. As the second word is loaded to the PDB, the first 
word is automatically transferred to the PIL and then execution takes place. 


10.7.1 OnCE PDB Register (OPDBR)


The PDB Register (OPDBR) is a read/write, 16-bit latch that stores the value of the Pro- 
gram Data Bus generated by the last Program Memory access of the DSP before the De- 
bug Mode is entered. OPDBR is available for read/write operations only through the serial 
interface. This register is affected by the operations performed during the Debug Mode 
and must be restored by the command controller when returning to normal mode. 


10.7.2 OnCE PIL Register (OPILR) 


The OPILR is a read only 16-bit latch that stores the instruction present in the Instruction 
Latch when the Debug Mode is entered. OPILR is available for read operations only 
through the serial interface. If a write is selected for this register, i.e., R/W = 0 and
RS4-RS0 = 01011, then zeros will be shifted into the OPILR. This register is affected by the
operations performed during the Debug Mode and must be restored by the command con- 
troller when returning to normal mode. Since there is no direct write access to this register, 
this task is accomplished by writing the OPDBR first and then the data from OPDBR is 
latched in OPILR. 


10.7.3. OnCE GDB Register (OGDBR) 


The OGDBR is a read only 16-bit latch that stores the value of the Global Data Bus. OGDBR
is available for read operations only through the serial interface. OGDBR is required
as a means of passing information between the chip and the command controller. OGDBR
will be mapped on the X internal I/O space at address $FFFF. Whenever the command
controller needs information such as a register or memory value it will force the chip to
execute an instruction that brings that information to the OGDBR. Then, the contents of
the OGDBR will be delivered serially to the command controller by the command “READ
GDB REGISTER”.


10.8 PAB HISTORY BUFFER 


To ease the debugging activity and keep track of the program flow, a First-In-First-Out, 
read only, buffer is provided. It stores the addresses of the last five instructions that were 
executed as well as the addresses of the last fetched instruction and of the instruction cur- 
rently in the instruction latch. 


Figure 10-6 illustrates a block diagram of the Program Address Bus FIFO. 


10.8.1 OnCE PAB Register for Fetch (OPABFR) 


The OPABFR is a read only 16-bit latch that stores the address of the last instruction that 
was fetched before the Debug Mode was entered. OPABFR is available for read opera- 
tions only through the serial interface. This register is not affected by the operations per- 
formed during the Debug Mode. 

Figure 10-5 Pipeline Information Registers


10.8.2 OnCE PAB Register for Decode (OPABDR) 


The 16-bit OPABDR stores the address of the instruction currently in the Instruction Latch.
This is the instruction that would have been decoded if the chip had not entered
the Debug Mode. OPABDR is available for read operations only through the serial
interface. This register is not affected by the operations performed during the Debug Mode.


10.8.3 OnCE PAB FIFO 


The FIFO is implemented as a circular buffer containing five 16-bit registers and one 3-bit 
counter. All registers have the same address but any read access to the FIFO will cause 
an increment of the counter thus pointing to the next FIFO register. The registers are se- 
rially available for read to the command controller through their common FIFO address. 
The FIFO is not affected by the operations performed during the Debug Mode except for 
the FIFO pointer increment when reading the FIFO. Figure 10-6 illustrates a block dia- 
gram of the Program Address Bus FIFO. 

Caution

To ensure FIFO coherence, a complete set of five reads of the FIFO must be
performed. This is necessary due to the fact that each read increments the
FIFO pointer thus causing it to point to the next location. After five reads the
pointer will point to the same location as before starting the read procedure.
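
The reason for the five-read rule can be seen in a minimal model of the circular buffer. This sketch assumes only what the text states (five registers at one common address, a pointer that advances on every read); the class name is hypothetical and the model does not claim anything about the order in which the hardware stores addresses.

```python
class PabFifo:
    """Toy model of the five-entry PAB history buffer."""

    DEPTH = 5

    def __init__(self, entries, pointer=0):
        if len(entries) != self.DEPTH:
            raise ValueError("the FIFO holds exactly five addresses")
        self.entries = list(entries)
        self.pointer = pointer % self.DEPTH

    def read(self):
        """Read through the common FIFO address; the pointer then advances."""
        value = self.entries[self.pointer]
        self.pointer = (self.pointer + 1) % self.DEPTH
        return value

    def read_all(self):
        """Perform the complete set of five reads required for coherence."""
        return [self.read() for _ in range(self.DEPTH)]


fifo = PabFifo([0x0040, 0x0041, 0x0042, 0x0043, 0x0044], pointer=3)
history = fifo.read_all()
```

After read_all the pointer is back at its starting position, so the buffer is left in the state it had when the Debug Mode was entered. Stopping after fewer than five reads would leave the pointer displaced.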

Figure 10-6 Program Address Bus FIFO

Bit    7     6    5    4     3     2     1     0
       R/W | GO | EX | RS4 | RS3 | RS2 | RS1 | RS0


Figure 10-7 OnCE Command Format

10.9 SERIAL PROTOCOL DESCRIPTION 


In order to permit an efficient means of communication between the command controller 
and the DSP chip, the following protocol has been adopted. Before starting any debugging 
activity, the command controller has to wait for an acknowledge from the chip which in- 
forms the command controller that it has entered the Debug Mode. Note that in case of a 
breakpoint, trace or software DEBUG/DEBUGcc instruction, the acknowledge itself is the 
one that initiates the debug session. The command controller communicates with the chip 
by sending 8-bit commands that may be accompanied by 16-bit data. After sending a
command, the command controller waits for the chip to acknowledge execution
of the command. The command controller may send a new command only after the chip
has acknowledged execution of the previous command.


10.9.1 OnCE Commands 


There are two types of commands: read commands (when the chip will deliver required
data) and write commands (when the chip will receive data and will write it in one of the
on-chip resources). The commands are 8 bits long and have the format shown in Figure 10-7.


10.9.1.1 OnCE Register Select (RS4-RS0) Bits 0-4


The Register Select bits define which register is the source (destination) for the read
(write) operation.

RS4-RS0   Register Selected

00000     Debug Status/Control (OSCR)
00001     Memory Breakpoint Counter (OMBC)
00010     Reserved
00011     Trace Counter (OTC)
00100     Memory Breakpoint Address (OMBAR)
00101     Reserved
00110     Reserved
00111     Reserved
01000     Global Data Bus (Transfer) Register (OGDBR)
01001     Program Data Bus Register (OPDBR)
01010     Program Address Bus Latch for Fetch (OPABFR)
01011     Instruction Latch (OPILR)
01100     Clear Breakpoint Counter
01101     Reserved
01110     Clear Trace Counter
01111     Reserved
10000     Reserved
10001     Program Address Bus FIFO and Increment Counter
10010     Reserved
10011     Program Address Bus Latch for Decode (OPABDR)
101xx     Reserved
11xx0     Reserved
11x0x     Reserved
110xx     Reserved
11111     No Register Selected


10.9.1.2 OnCE Exit Command (EX) Bit 5


Bit 5 in the OnCE command word is the exit command. To leave the OnCE mode and re- 
enter the normal operating mode, both the EX and GO bits must be asserted in the OnCE 
input command register. There are three exit conditions: 


1. If EX and GO are set, the chip will leave the Debug Mode, execute the DSP 
instruction in the pipeline and then resume normal operation. If the register 
select bits are set to $1F (RS4-RS0 = 11111) then the last instruction (the
instruction in the PILB) is re-executed.


2. If EX is set without GO, then when the OnCE has finished writing the instruction
latch (PILB) register, the OnCE state machine will get another command
instead of leaving the OnCE mode. 


3. If EX is set without GO, then when the OnCE is finished writing the PDB 
(PILB) register, the OnCE state machine will get another command instead of 
leaving the OnCE mode. 


There is no acknowledgment on the DSO pin when the chip leaves the OnCE mode fol- 
lowing a GO or an EX. 

EX    Action

0     Remain in Debug Mode
1     Leave Debug Mode


10.9.1.3. OnCE Go Command (GO) Bit 6 


If GO is set, execute instruction. There is no acknowledgment on the DSO pin when the 
chip leaves the OnCE mode following a GO or an EX. 


GO    Action

0     Inactive (no action taken)
1     Execute DSP instruction


10.9.1.4 OnCE Read/Write Command (R/W) Bit 7 


R/W   Action

0     Write the data associated with the command into the register specified by RS4-RS0
1     Read the data contained in the register specified by RS4-RS0
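
As an illustration of the command format, the sketch below packs the four fields into one byte. The register-select codes are copied from the table in 10.9.1.1. The R/W polarity (1 for read, 0 for write) is taken from the statement in 10.7.2 that R/W = 0 selects a write. The dictionary keys and function names are invented for the example.

```python
# Register-select codes from the RS4-RS0 table (keys are illustrative).
REGISTER_SELECT = {
    "OSCR": 0b00000,
    "OMBC": 0b00001,
    "OTC": 0b00011,
    "OMBAR": 0b00100,
    "OGDBR": 0b01000,
    "OPDBR": 0b01001,
    "OPABFR": 0b01010,
    "OPILR": 0b01011,
    "PAB_FIFO": 0b10001,
    "OPABDR": 0b10011,
    "NONE": 0b11111,
}


def once_command(register, read=False, go=False, ex=False):
    """Build the 8-bit command: R/W in bit 7, GO in bit 6, EX in bit 5, RS in bits 4-0."""
    rs = REGISTER_SELECT[register]
    return (int(read) << 7) | (int(go) << 6) | (int(ex) << 5) | rs


def command_bits(command):
    """Return the command as a list of bits, most significant bit first."""
    return [(command >> i) & 1 for i in range(7, -1, -1)]


read_pdb = once_command("OPDBR", read=True)
write_pdb_go = once_command("OPDBR", go=True)
```

Under these assumptions READ PDB REGISTER encodes as $89 and WRITE PDB REGISTER with GO as $49. command_bits gives the order in which the bits would be presented on DSI, since data is shifted into the OISR most significant bit first.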


10.10 DSP56100 TARGET SITE DEBUG SYSTEM REQUIREMENTS 


A typical debug environment consists of a target system where the DSP resides in the 
user defined hardware. The debug serial port interfaces to the command converter over
a six-wire link consisting of the four debug serial lines, a ground wire, and a reset wire. The reset
wire is optional and is only used to reset the DSP and its associated circuitry. 


The command controller acts as the medium between the DSP target system and a host 
computer. The host computer interfaces to the controller using a standard RS232 three 
wire cable or the Application Development System parallel bus. A jumper option on the 
command controller board will select which method of communications will be used. This 
allows a variety of different host computers to communicate with the controller circuit. The 
controller circuit provides several important functions. It acts as a serial debug port driver, 
host computer command interpreter, and DSP controller. The DSP acts as a slave when 
in the debug mode and provides data only upon request. The controller issues commands 
based on the host computer inputs from a user interface program which communicates 
with the user. 


10.11 USING THE OnCE


The following notations are used:

Commands require eight clocks
ACK = Wait for acknowledge on DSO line
CLK = Issue 16 clocks to read out data from selected register


10.11.1 Begin Debug Activity 


Debug activity begins on an instruction boundary after the DR pin is asserted, a DEBUGcc 
opcode is executed, a trace countdown occurs, or a breakpoint register countdown oc- 
curs. If the instruction executing when the DR pin is asserted is a REP instruction or the 
instruction following a REP instruction, then the debug activity will begin after the instruc- 
tion following the REP instruction finishes being repeated. The first ACK indicates that the 
OnCE controller is ready to receive commands and data. Most of the Debug activities will 
have the following beginning: 


ACK 


1. Save pipeline information: 
a. Send command READ PDB REGISTER 
b. ACK 
c. CLK 
d. Send command READ OPILR 
e. ACK 
f. CLK 


2. Read PAB FIFO and fetch/decode info (this step is optional): 
a. Send command READ PAB address for fetch 
b. ACK 
c. CLK 
d. Send command READ PAB address for decode 
e. ACK 
f. CLK 
g. Send command READ FIFO REGISTER (and increment pointer) 
h. ACK 
i. CLK 
j. Send command READ FIFO REGISTER (and increment pointer) 
k. ACK 
l. CLK
m. Send command READ FIFO REGISTER (and increment pointer)
n. ACK 
o. CLK 
p. Send command READ FIFO REGISTER (and increment pointer) 
q. ACK 
r. CLK 
s. Send command READ FIFO REGISTER (and increment pointer) 
t. ACK 
u. CLK 
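
The opening sequence above can be written down as data that a command controller program might walk through. This is a sketch, not a driver: the step tuples, names, and the idea of returning a list are assumptions for illustration. The command bytes assume R/W = 1 for a read and the register-select codes from 10.9.1.1.

```python
READ = 0x80             # R/W bit set: read command (assumed polarity)
RS_OPDBR = 0b01001      # Program Data Bus register
RS_OPABFR = 0b01010     # PAB latch for fetch
RS_OPILR = 0b01011      # instruction latch
RS_FIFO = 0b10001       # PAB FIFO, pointer increments on read
RS_OPABDR = 0b10011     # PAB latch for decode


def read_register_steps(rs):
    """Send a read command, wait for the acknowledge, clock out 16 bits."""
    return [("CMD", READ | rs), ("ACK", None), ("CLK", 16)]


def begin_debug_steps(include_history=True):
    """List the steps of 10.11.1, optionally with the PAB history reads."""
    steps = [("ACK", None)]                    # chip announces Debug Mode
    for rs in (RS_OPDBR, RS_OPILR):            # step 1: save the pipeline
        steps += read_register_steps(rs)
    if include_history:                        # step 2: optional
        for rs in (RS_OPABFR, RS_OPABDR):
            steps += read_register_steps(rs)
        for _ in range(5):                     # full set of five FIFO reads
            steps += read_register_steps(RS_FIFO)
    return steps


steps = begin_debug_steps()
```

Keeping the five FIFO reads together in one loop makes it harder to break the coherence rule from 10.8.3 by accident.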


10.11.2 Displaying a Specified Register 
1. Send command WRITE PDB REGISTER and GO (no EX) 
(ODEC selects PDB as destination for serial data.) 
2. ACK 


3. Send the 16-bit opcode: “MOVE reg,x:OGDB”
(After all 16-bits have been received, the PDB register drives the PDB. ODEC 
generates PRNEW and releases the chip from the “halt” state and the contents of 
the register specified in the instruction is loaded in the GDB REGISTER. The 
PRCYC1 signal (an internal signal) that marks the end of the instruction brings the 
chip again in the “halt” state and an acknowledge is issued to the command 
controller) 


4. ACK 


5. Send command READ GDB REGISTER 
(ODEC selects GDB as the source for serial data and an acknowledge is issued to 
the command controller) 


6. ACK 
7. CLK 
10.11.3 Displaying X Memory Area Starting from Address xxxx 


This command uses Rn to minimize serial traffic. 


1. Send command WRITE PDB REGISTER and GO (no EX). 
(ODEC selects PDB as destination for serial data.) 


2. ACK 
3. Send the 16-bit opcode: “MOVE R0,x:OGDB”
(After all 16-bits have been received, the PDB register drives the PDB. ODEC
generates PRNEW and releases the chip from the “halt” state and the contents of
R0 are loaded in the GDB REGISTER. The PRCYC1 signal that marks the end of
the instruction brings the chip again to the “halt” state and an acknowledge is
issued to the command controller)


4. ACK 


5. Send command READ GDB REGISTER 
(ODEC selects GDB as the source for serial data and an acknowledge is issued to 
the command controller) 


6. ACK 


7. CLK 
(The command controller generates 16 clocks that shift out the contents of the
GDB register. The value of R0 is thus saved and will be restored before exiting the
Debug Mode)


8. Send command WRITE PDB REGISTER (no GO, no EX). 
(ODEC selects PDB as destination for serial data.) 


9. ACK 


10. Send the 16-bits of opcode: “MOVE #$xxxx,R0”
(After all 16-bits have been received, the PDB register drives the PDB. ODEC 
generates PRNEW so the PILR is loaded with the opcode. An acknowledge is 
issued to the command controller) 


11.ACK 


12.Send command WRITE PDB REGISTER and GO (no EX). 
(ODEC selects PDB as destination for serial data.) 


13. ACK 


14. Send the 16-bits of the 2nd word of: “MOVE #$xxxx,R0” (the xxxx field) where xxxx
is the address to be read. 
(After all 16-bits have been received, the PDB register drives the PDB. ODEC 
releases the chip from the “halt” state and the instruction starts execution. The 
PRCYC1 signal that marks the end of the instruction brings the chip again to the 
“halt” state and an acknowledge is issued to the command controller) 
15. ACK


16.Send command WRITE PDB REGISTER and GO (no EX). 
(ODEC selects PDB as destination for serial data.) 


17. ACK 

18.Send the 16-bit opcode: “MOVE X:(R0)+,x:OGDB” 
(After all 16-bits have been received, the PDB register drives the PDB. ODEC 
generates PRNEW and releases the chip from the “halt” state and the contents of
X:(R0) are loaded in the GDB REGISTER. The PRCYC1 signal that marks the end
of the instruction brings the chip again to the “halt” state and an acknowledge is
issued to the command controller) 


19. ACK 


20.Send command READ GDB REGISTER 
(ODEC selects GDB as source for serial data and an acknowledge is issued to the 
command controller) 


21.ACK 
22.CLK 


23.Send command NO SELECTION and GO (no EX). 
(ODEC releases the chip from the “halt” state and the instruction is executed once 
again in a “REPEAT-like” fashion. The PRCYC1 signal that marks the end of the
instruction brings the chip again to the “halt” state and an acknowledge is issued 
to the command controller.) 


24. ACK 


25.Send command READ GDB REGISTER 
(ODEC selects GDB as source for serial data and an acknowledge is issued to the 
command controller.) 


26. ACK 
27.CLK 


28. Repeat from step 23 until the entire memory area is examined. At the end of the 
process R0 has to be restored.
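
Steps 16 through 28 form a loop in which the post-increment move is loaded once and then re-executed with the NO SELECTION and GO command. The sketch below lists those steps for a given number of words. It is illustrative only: the opcode word for “MOVE X:(R0)+,x:OGDB” is not given in this section, so it is passed in by the caller, and the step tuples and function name are assumptions. The command bytes assume the bit layout of Figure 10-7 with R/W = 1 for a read.

```python
WRITE_PDB_GO = 0x40 | 0b01001   # WRITE PDB REGISTER and GO (no EX)
READ_GDB = 0x80 | 0b01000       # READ GDB REGISTER
REPEAT_GO = 0x40 | 0b11111      # NO SELECTION and GO (no EX)


def dump_read_steps(move_postinc_opcode, word_count):
    """List the controller steps that read word_count consecutive X words.

    move_postinc_opcode is the caller-supplied 16-bit encoding of the
    post-increment move; R0 is assumed to be loaded already (steps 8-15).
    """
    if word_count < 1:
        raise ValueError("word_count must be at least 1")
    # Steps 16-22: load the move once, execute it, read the first word.
    steps = [
        ("CMD", WRITE_PDB_GO), ("ACK", None),
        ("DATA", move_postinc_opcode & 0xFFFF), ("ACK", None),
        ("CMD", READ_GDB), ("ACK", None), ("CLK", 16),
    ]
    # Steps 23-27, repeated: re-execute the same instruction and read again.
    for _ in range(word_count - 1):
        steps += [
            ("CMD", REPEAT_GO), ("ACK", None),
            ("CMD", READ_GDB), ("ACK", None), ("CLK", 16),
        ]
    return steps
```

Each additional word costs two 8-bit commands and 16 data clocks, compared with a command, a 16-bit opcode, a second command and 16 data clocks if the opcode were resent every time. This is the saving in serial traffic that the use of Rn is intended to provide.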


10.11.4 Returning from Debug Mode to Normal Mode


There are two cases for returning from the debug mode. In case 1, control will be returned 
to the program that was running before debug was initiated and in case 2, the registers 
will be changed to jump to a different program. There is no acknowledgment on the DSO 
pin when the chip leaves the OnCE mode following a GO, EX. This is a special case of 
the “write a register” option. 


10.11.4.1 Case 1: Returning from Debug Mode to Normal Mode 


1. Send command WRITE PDB REGISTER (no GO, no EX).
(ODEC selects the PDB register as destination for serial data. Also ODEC selects
the on-chip PAB register as source for the PAB bus. After the PAB was driven an
acknowledge is issued to the command controller)


2. ACK 
3. Send the 16 bits of the saved PILB (instruction latch) value.
(After all 16 bits have been received, the PDB register drives the PDB. ODEC
generates PRNEW so the entire chip loads the opcode. An acknowledge is issued
to the command controller.)


4. ACK 


5. Send command WRITE PDB REGISTER (GO, EX). 
(ODEC selects PDB as destination for serial data.) 


6. ACK 


7. Send the 16 bits of the saved PDB value.
(After all 16 bits have been received, the PDB register drives the PDB. ODEC
releases the chip from the “halt” state and the Debug Mode bit in OSCR is cleared.
The chip continues to execute instructions until a Debug Mode condition occurs.)


10.11.4.2 Case 2: Jump to a New Program (Go from Address $xxxx). 


1. Send command WRITE PDB REGISTER (no GO, no EX). 
(ODEC selects PDB as destination for serial data.) 


2. ACK 
3. Send the 16 bits of the opcode of a two-word jump instruction instead of the saved PIL
(instruction latch) value.
(After all 16 bits have been received, the PDB register drives the PDB. ODEC
causes the DSP to load the opcode. An acknowledge is issued to the command
controller.)


4. ACK 


5. Send command WRITE PDB REGISTER (GO, EX). 
(ODEC selects PDB as destination for serial data.) 


6. ACK 


7. Send the 16 bits of the target absolute address ($xxxx). The chip resumes fetching
from the target address; the pipeline needs no special handling. Note that the
trace counter counts this instruction, so the current trace counter value may need to be
corrected if the trace mode enable bit in the OSCR has been set.
(After all 16 bits have been received, the PDB register drives the PDB. ODEC
releases the chip from the “halt” state and the Debug Mode bit in OSCR is cleared.
The chip first executes the jump instruction and then fetches the instruction from
the target address. The chip continues to execute instructions from that address
until a Debug Mode condition occurs.)
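
Both return cases share one four-step shape and differ only in the two 16-bit words that are sent. The sketch below builds that step list for a host program. It is illustrative only: the function names and tuple layout are hypothetical, the command strings are symbolic labels rather than real OnCE command encodings, and the jump opcode must be supplied by the caller because its encoding is not given in this section.

```python
def _check_word(value):
    """Reject anything that does not fit in one 16-bit serial transfer."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("expected a 16-bit value, got %r" % (value,))
    return value


def exit_sequence(first_word, second_word):
    """Return (kind, payload, expect_ack) steps for leaving the debug mode."""
    return [
        ("command", "WRITE PDB REGISTER (no GO, no EX)", True),
        ("data", _check_word(first_word), True),
        ("command", "WRITE PDB REGISTER (GO, EX)", True),
        # No acknowledge appears on DSO once the chip leaves the OnCE mode.
        ("data", _check_word(second_word), False),
    ]


def return_to_program(saved_pil, saved_pdb):
    """Case 1: restore the saved instruction latch and PDB values."""
    return exit_sequence(saved_pil, saved_pdb)


def jump_to_address(jump_opcode, target_address):
    """Case 2: send a two-word jump opcode, then the absolute target address."""
    return exit_sequence(jump_opcode, target_address)


for step in return_to_program(0x1234, 0x5678):
    print(step)
```

A host that waits for an acknowledge after the final data word would hang, so the last step carries expect_ack set to False.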
11.1 SOFTWARE

All software support products run on the following platforms — IBM™ PC, Macintosh™, 
and SUN™ workstation. The software, written in C, consists of an assembler, linker, and 
simulator which are marketed as an integrated product. 


11.2 MACRO CROSS ASSEMBLER 

The ASM56100 Macro Cross Assembler program is a full-featured macro cross assembler
that translates one or more source files containing DSP instruction mnemonics,
operands, and assembler directives. In the Relocation mode, it produces relocatable
object modules that are relocated and linked by the Motorola DSP Linker. In the
Absolute mode, it generates absolute executable files. The assembler recognizes the
full instruction set and all addressing modes of the DSP56100 family.


This assembler offers the usual complement of features found in modern assemblers, 
such as conditional assembly, file inclusion, nested macros with support for macro librar- 
ies (via the MACLIB directive), and modular programming constructs ordinarily found 
only in higher level languages. 


The unique architecture and parallel operation of the DSP demand special purpose
facilities and programming aids which this assembler readily provides. These include
built-in functions for common transcendental math computations such as sine, cosine,
log, and square root functions; arbitrary expressions and modulo operations; and
directives to define circular and bit-reversed data buffers. Moreover, the assembler
incorporates extensive error checking and reporting to indicate programming violations
peculiar to the digital signal processing environment or stemming from the advanced
features of the DSP. These include errors for improper nesting of hardware DO loops and
improper address boundaries for circular data buffers and bit-reversed buffers.
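
The kind of boundary check applied to circular buffers can be sketched as follows. This sketch assumes the modulo-addressing rule of the family's Address Generation Unit: a circular buffer of M words must start at an address that is a multiple of 2^k, where 2^k is the smallest power of two not less than M. The function names are hypothetical, and the assembler's actual checks and messages are not described in this section.

```python
def required_alignment(modulus):
    """Smallest power of two that is greater than or equal to `modulus`."""
    if modulus < 1:
        raise ValueError("buffer size must be at least 1 word")
    return 1 << (modulus - 1).bit_length()


def is_valid_circular_base(base_address, modulus):
    """True if `base_address` has zeros in its k low-order bits (2**k >= modulus)."""
    return base_address % required_alignment(modulus) == 0


# A 21-word buffer needs a base address that is a multiple of 32.
for base in (0x0040, 0x0030):
    print(hex(base), is_valid_circular_base(base, 21))
```

Buffers whose size is not a power of two therefore leave a gap between the end of the buffer and the next legal base address.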


The assembler also generates source code listings which include numbered source 
lines, optional titles and subtitles, optional instruction cycle counts, symbol table and 
cross-reference listings, and memory use reports. 


To summarize, features of the assembler are: 


• Produces relocatable object modules compatible with the DSP linker program in
  the Relocation mode
• Produces absolute executable files compatible with the Simulator program
  (SIM56100) in the Absolute mode
• Supports full instruction set, memory spaces, and parallel data transfer fields of the
• Modular programming features including local labels, sections, and external
  definition/reference directives
• Nested macro libraries
• Complex expression evaluation including boolean operators
• Built-in functions for data conversion, string comparison, and common transcendental
  math operations
• Directives to define circular and bit-reversed buffers
• Extensive error checking and reporting


11.3 LINKER/LIBRARIAN

The linker relocates and links relocatable object modules from the Macro Cross Assem- 
bler to create an absolute executable file which can be loaded directly into the 
DSP56100 simulator or converted to Motorola S-record format for PROM burning. 


The librarian utility merges multiple separate relocatable object modules into a single
file. This avoids having to reassemble known bug-free routines every time the
mainline program is assembled.
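
The S-record conversion is performed by the Motorola tools, but the record layout is simple enough to illustrate. The sketch below builds a single S1 record (16-bit address) from a sequence of bytes. It is not the linker's converter: the function name is hypothetical, and how 16-bit DSP words are split into bytes for PROM programming is not specified in this section, so the example works on bytes only.

```python
def s1_record(address, data):
    """Build one Motorola S1 record (16-bit address) from a sequence of bytes."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("S1 records carry a 16-bit address")
    data = bytes(data)
    count = len(data) + 3          # 2 address bytes + data + 1 checksum byte
    if count > 0xFF:
        raise ValueError("too many data bytes for one record")
    body = bytes([count, address >> 8, address & 0xFF]) + data
    # Checksum: ones' complement of the low byte of the sum of count,
    # address, and data bytes.
    checksum = ~sum(body) & 0xFF
    return "S1" + body.hex().upper() + "%02X" % checksum


print(s1_record(0x0000, [0x12, 0x34, 0x56, 0x78]))
```

A complete file would normally also carry a header record and a termination record around the data records; those are omitted here.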


11.4 SIMULATOR PROGRAM 

The SIM56100 Simulator program is a software tool for developing programs and algo- 
rithms for the DSP. This program exactly emulates all of the functions (except for the 
OnCE) of the DSP including all on-chip peripheral operations, the entire internal and 
external memory space, all memory and register updates associated with program code 
execution, and all exception processing activity. This enables the Simulator program to 
provide an accurate measurement of code execution time which is so critical in digital 
signal processing applications. 
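
A cycle count reported by the simulator becomes an execution time once a clock rate is chosen. The sketch below is illustrative only. It assumes the figures given in Table 1-1 (a 60 MHz clock and a 33.3 ns instruction cycle, that is, two clock periods per instruction cycle); the function name and parameters are hypothetical and are not part of SIM56100.

```python
def execution_time_ns(instruction_cycles, clock_hz=60e6, clocks_per_cycle=2):
    """Convert a count of instruction cycles to nanoseconds.

    The defaults follow the Table 1-1 figures and are assumptions for
    any other clock rate or device.
    """
    if instruction_cycles < 0:
        raise ValueError("cycle count cannot be negative")
    return instruction_cycles * clocks_per_cycle / clock_hz * 1e9


# Time for a routine that the simulator reports as 1000 instruction cycles.
print(round(execution_time_ns(1000), 1), "ns")
```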


The Simulator program executes DSP object code generated by the Linker or the Simu- 
lator’s internal single-line assembler. The object code is loaded into the simulated DSP 
memory map. Instruction execution can proceed until a user-defined breakpoint is 
encountered; or in single-step mode, stopping after each instruction has been executed. 
During program debug, the registers or memory locations may be displayed or changed. 


The Simulator package includes linkable object code libraries of simulator functions that 
were used to create the simulator. The libraries allow a customized simulator to be built 
and integrated with unique system simulations. Source code for some of the functions, 
such as the terminal I/O functions and external memory accesses, is provided to allow 
close simulation of the particular application. 


To summarize, features of the Simulator program are:


• Multiple device simulation
• Source level symbolic debug of assembly source programs
• Conditional or unconditional breakpoints
• Program patching using a Single-Line Assembler/Disassembler
• Instruction and Cycle timing counters
• Session and/or Command Logging for later reference
• Input/Output ASCII files for device peripherals
• Help file and Help line display of Simulator commands
• Macro command definition and execution
• Display Enable/Disable of Registers and Memory
• Hexadecimal/Decimal/Binary calculator


11.5 HARDWARE 

Each DSP56100 family member has an Application Development System (ADS). All of 
these are essentially identical in operation and features. The differences that do exist are 
due to the specific nature of each chip. While the example here is the DSP56156, all 
DSP56100 family ADS’s operate in essentially the same way. Upgrading an ADS to run 
a different Motorola DSP is done by purchasing and plugging in a new Application Devel- 
opment Module (see Figure 11-1). 


The DSP56156 ADS is a four component system which acts as a development tool for 
designing, debugging, and evaluating real-time DSP56156 target system equipment. 
The ADS simplifies evaluation of the user’s prototype hardware/software product by 
making all of the essential DSP56156 timing and I/O circuitry easily accessible. The ADS 
takes full advantage of the On-Chip Emulation (OnCE) circuits of the DSP to allow the 
user to control the target non-intrusively. 


An IBM PC, Macintosh II, or SUN acts as the medium between the user and the DSP 
hardware. The four components consist of an Application Development Module (ADM) 
which contains a DSP56156 processor and control circuitry, a HOST-BUS interface 
board for controlling up to 8 ADMs, a command convertor board which interacts with the 
target OnCE serial debug port, and a software program which interacts with the user and 
controls the ADM(s) and/or target system. 


DSP algorithm development is simplified with features such as multiple file I/O capability 
to the target under DSP56156 program control and immediate access to a hex/fractional 
arithmetic calculator. The ADS is fully compatible with the DSP56100CLASx design-in 
software package and may act as an accelerator for testing DSP56156 algorithms. 
DSP56156 programs may be executed in real-time or by single/multiple stepping through 
instructions. 
As many as 99 conditional and/or unconditional software breakpoints may be placed in
ADM program memory. A hardware breakpoint range may be set to halt program execu- 
tion whenever a program or data address falls within the specified range. All breakpoints 
may have actions associated with them or may cause an immediate halt and display of 
enabled registers. 


Figure 11-1 illustrates the ADS being used as a hardware evaluation tool or software 
accelerator. The ADM card has a 10 pin connector which provides an access point for 
the command convertor OnCE interface. 


Figure 11-2 illustrates the ADS being used as an emulator where the user has a defined
target system and needs to debug the hardware or software without any special target
footprint cable which could be intrusive or limiting. Here the user must provide an access
point for the 10 pin OnCE interface cable. This may be a simple 2 row x 5 set of test
points.


Figure 11-1 Application Development


Figure 11-2 Target Circuit Emulation


The ADM hardware, as illustrated in Figure 11-3, provides up to 64K words of user-con- 
figurable high-speed SRAM with no wait states required on the external bus of the 
DSP56156. There are also sockets for 2K to 8K words of user-program EPROM on the 
external bus. The ADM provides easy access to all DSP56156 pins via a 96-pin Euro- 
card male connector as well as a 96 pin Berg male stake connector. This enables the 
user to design full-speed application circuits which may be connected to the DSP using 
standard Euro-card prototype boards. 


Emulation of a target system is made easy by disconnecting the command convertor
board from the ADM and connecting the 10 pin OnCE serial port cable to the target
system. This allows the user to control the target system non-intrusively so that real-time
execution may be achieved at the maximum clock frequency of the DSP56156.


11.6 HARDWARE FEATURES
• Full speed operation
• Multiple ADM support with programmable ADM addressing
• 8K words of configurable static RAM expandable to 64K words
• 2K words of EPROM with sockets expandable to 64K words
• Stand-alone operation of ADM after initial development
• Full support of program/data memory maps
• 96 pin connector provides access to all DSP56156 pins
• OnCE Command Convertor card for non-intrusive real-time emulation
• Special peripheral connectors available for easy access to DSP peripherals
• 3V emulation support in target environments


Figure 11-3 Application Development Module
(block labels: 2-8K EPROM, 16-64K SRAM, Expansion Connector, Reset/Mode Control, OnCE)


11.7 SOFTWARE FEATURES 
• Single/Multiple stepping through DSP56156 object programs.
• Conditional or unconditional software and hardware breakpoints.
• Program patching using a Single-Line Assembler/Disassembler.
• Session and/or Command Logging for later reference.
• Loading and Saving of files to/from ADM Memory.
• Macro command definition and execution.
• Display Enable/Disable of Registers and Memory.
• Debug commands which support Multiple DSP56156 development.
• Hexadecimal/Decimal/Binary Fractional calculator.
• System commands from within ADS User Interface Program.
• Multiple Input/Output file access from DSP56156 object programs.
• On-line help screens for each command and DSP56156 register.
• Compatible with the DSP56100CLASx Assembler and Simulator.


11.8 OPERATING ENVIRONMENT 

The minimum hardware requirements for the DSP56156ADS User Interface Program 
include: IBM PC-DOS/MS-DOS v3.x, 4.x, or 5.x; Macintosh II with 1 Mbyte of RAM and 
running Mac OS 4.2 or later; or SUN-4 running BSD 4.2 with SUNOS 4.1.2 or Solaris 2.x. 
12.1 INTRODUCTION

This section is intended as a guide to the DSP support services and products offered by 
Motorola. This includes training, development hardware and software tools, telephone 
support, etc. 


12.2 THIRD PARTY SUPPORT 
User support from the conception of a design through completion is available from
Motorola and third-party companies as shown in the following list:


                Motorola                    Third Party

Design          Data Sheets                 Data Acquisition Packages
                Application Notes           Filter Design Packages
                Application Bulletins       Operating System Software
                Software Examples           Simulator

Prototyping     Assembler                   Logic Analyzer with
                Linker                        DSP561xx ROM Packages
                C Compiler                  Data Acquisition Cards
                Simulator                   DSP Development System
                Application Development       Cards
                  System (ADS)              Operating System Software
                In-Circuit Emulator         Debug Software
                  Cable for ADS

Design          Application Development     Data Acquisition Packages
Verification      System (ADS)              Logic Analyzer with
                In-Circuit Emulator           DSP561xx ROM Packages
                Simulator                   Data Acquisition Cards
                                            DSP Development System
                                              Cards
                                            Application-Specific
                                              Development Tools
                                            Debug Software

Specific information on the companies that offer these products is available by calling the 
DSP third party information number given in Section 12.10. 
The following is a partial list of the support available for the DSP561xx. Additional
information on DSP56100 family members can be obtained through Dr. BuB or the 
appropriate support telephone service. 


12.3 MOTOROLA DSP PRODUCT SUPPORT

• DSP56100CLASx Design-In Software Package which includes:
    Relocatable Macro Assembler
    Linker
    Simulator (simulates single or multiple DSP561xxs)
    Librarian
• DSP561xx Applications Development System (ADS)
• Support Integrated Circuits
• DSP Bulletin Board (Dr. BuB)
• Motorola DSP Newsletter
• Motorola Technical Service Engineers (TSEs)
  See your local telephone directory for the Motorola Semiconductor Sector sales
  office telephone number.
• Design Hotline
• Applications Assistance
• Marketing Information
• Third-Party Support Information
• University Support Information


12.3.1 DSP56100CLASx Assembler/Simulator 


12.3.1.1 Macro Cross Assembler and Simulator Platforms
1. IBM™ PCs and clones using an 80386 or upward compatible processor 
2. Macintosh™ computers with a NU-BUS™ expansion port 
3. SUN computer 


12.3.1.2 Macro Cross Assembler Features 
• Production of relocatable object modules compatible with linker program when in
  relocatable mode
• Production of absolute files compatible with simulator program when in absolute
  mode
• Supports full instruction set, memory spaces, and parallel data transfer fields of
  the DSP561xx
• Modular programming features: local labels, sections, and external
  definition/reference directives
• Nested macro processing capability with support for macro libraries
• Complex expression evaluation including boolean operators
• Built-in functions for data conversion, string comparison, and common
  transcendental math functions
• Directives to define circular and bit-reversed buffers
• Extensive error checking and reporting


12.3.1.3 Simulator Features 
• Simulation of all DSP56100 family DSPs
• Simulation of multiple DSP56100 family DSPs
• Linkable object code modules:
    - Nondisplay simulator library
    - Display simulator library
• C language source code for:
    - Screen management functions
    - Terminal I/O functions
    - Simulation examples
• Single stepping through object programs
• Conditional or unconditional breakpoints
• Program patching using a single-line assembler/disassembler
• Instruction, clock cycle, and histogram counters
• Session and/or command logging for later reference
• ASCII input/output files for peripherals
• Help-line display and expanded on-line help for simulator commands
• Loading and saving of files to/from simulator memory
• Macro command definition and execution
• Display enable/disable of registers and memory
• Hexadecimal/decimal/binary calculator


12.3.2 Application Development Systems 
• Application Development Systems (ADS) are available for all family members.
  Upgrading an ADS to run a different Motorola DSP is done by purchasing and
  plugging in a new Application Development Module.
12.3.2.1 DSP561xxADSx Application Development System Hardware Features
• Full-speed operation
• Multiple application development module (ADM) support with programmable ADM
  addresses
• User-configurable RAM for DSP561xx code development
• Expandable monitor ROM
• 96-pin Euro-card connector making all pins accessible
• In-circuit emulation capabilities using OnCE
• Separate berg pin connectors for alternate accessing of serial or host/DMA ports
• ADM can be used in stand-alone configuration
• No external power supply needed when connected to a host platform
• 3V emulation support in target environments


12.3.2.2 DSP561xxADSx Application Development System Software Features 
• Full-speed operation
• Single/multiple stepping through DSP561xx object programs
• Up to 99 conditional or unconditional breakpoints
• Program patching using a single-line assembler/disassembler
• Session and/or command logging for later reference
• Loading and saving files to/from ADM memory
• Macro command definition and execution
• Display enable/disable of registers and memory
• Debug commands supporting multiple ADMs
• Hexadecimal/decimal/binary calculator
• Host operating system commands from within ADS user interface program
• Multiple OS I/O file access from DSP561xx object programs
• Fully compatible with the DSP56100CLASx design-in software package
• On-line help screens for each command and DSP561xx register


12.4 SUPPORT INTEGRATED CIRCUITS 
• DSP56ADC16 16-bit, 100-kHz analog-to-digital converter
• DSP56401 AES/EBU processor
• DSP56200 FIR filter
12.5 MOTOROLA DSP NEWS

The Motorola DSP News is a quarterly newsletter providing information on new products, 
application briefs, questions and answers, DSP product information, third-party product 
news, etc. This newsletter is free and is available upon request by calling the marketing 
information phone number listed below. 


12.6 MOTOROLA FIELD APPLICATION ENGINEERS

Information and assistance for DSP applications is available through the local Motorola
field office. See your local telephone directory for telephone numbers or call
(512) 891-2030.


12.7 DSP APPLICATIONS HELP LINE - (512) 891-3230
Design assistance for specific DSP applications is available by calling this number. 


12.8 DESIGN HOTLINE - 1-800-521-6274 
This is the Motorola number for information pertaining to any Motorola product. 


12.9 DSP MARKETING INFORMATION - (512) 891-2030 
Marketing information, including brochures, application notes, manuals, and price quotes
for Motorola DSP-related products, is available by calling this number.


12.10 DSP THIRD-PARTY SUPPORT INFORMATION - (512) 891-3098
Information concerning third-party manufacturers using and supporting Motorola DSP 
products is available by calling this number. Third-party support includes: 


Filter design software 
Logic analyzer support 
Boards for VME, IBM-PC/XT/AT, MACII, SPARC, HP300 
Development systems 
Data conversion cards 
Operating system software 
Debug software 
Additional information is available on Dr. BuB and in DSP News. 


12.11 DSP UNIVERSITY SUPPORT - (512) 891-3098
Information concerning university support programs and university discounts for all 
Motorola DSP products is available by calling this number. 
12.12 DSP TRAINING COURSES - (602) 897-3665 or (800) 521-6274
Training information on the DSP56100 family members is available by writing: 


Motorola SPS Training and Technical Operations 
Mail Drop EL524 

P. O. Box 21007 

Phoenix, Arizona 85036 


or by calling the number above. A technical training catalog is available which describes 
these courses and gives the current training schedule and prices. 


12.13 Dr. BuB ELECTRONIC BULLETIN BOARD

Dr. BuB is an electronic bulletin board providing free source code on a wide variety of
topics for developing applications with Motorola DSP products. The software
library includes FFTs, FIR filters, IIR filters, lattice filters, matrix algebra
routines, companding routines, floating-point routines, and others. In addition, the latest
product information and documentation (including information on new products and
improvements on existing products) is posted. Questions concerning Motorola DSP
products posted on Dr. BuB are answered promptly.


Dr. BuB is open 24 hours a day, 7 days per week and offers the DSP community
information on Motorola’s DSP products, including:


• Public domain source code for Motorola’s DSP products including the DSP56000
  family, the DSP56100 family and the DSP96002
• Announcements about new products and policies
• Technical discussion groups monitored by DSP application engineers
• Confidential mail service
• Calendar of events for Motorola DSP
• Complete list of Motorola DSP literature and ordering information
• Information about the Third-Party and University Support Programs


To log on to the bulletin board, follow these instructions:


1. Set the character format on your modem to 8 data bits, no parity, 1 stop bit, 
then dial (512) 891-3771. Dr. BuB will automatically set the data transfer rate 
to match your modem (9600, 4800, 2400, 1200 or 300 BPS). 


2. Once the connection has been established, you will see the Dr. BuB login
prompt (you may have to press the carriage return a couple of times). If you just
want to browse the system, log in as guest. If you would like all the privileges
that are normally allowed on the system, enter new at the login prompt.


3. If you open a new account, you will be asked to answer some questions such
as name, address, phone number, etc. After answering these questions, you
will have immediate access to all features of the system including download
privilege, electronic mail and participation in discussion groups.


4. You will have an hour of access time for each call (upload and download time 
doesn’t count against you) and you can call as often as you like. If you need 
more time on line, just send an electronic mail request to the system operator 
(sysop). 


The following is a partial list of the software available on Dr. BuB. 


Document ID    Version    Synopsis    Size
12.13.1 Audio 
rvb1.asm 1.0 Easy-to-read reverberation routine 17056 
rvb2.asm 1.0 Same as RVB1.ASM but optimized 15442 
stereo.asm 1.0 Code for C-QUAM AM stereo decoder 4830 
stereo.hlp 1.0 Help file for STEREO.ASM 620 
dge.asm 1.0 Digital Graphic Equalizer code from 14880 


12.13.2 Benchmarks 
Appendix B.1 through B.2.26 DSP56116 (DSP56100 Family) Benchmarks 44436 


Appendix B.3 through B.3.9 DSP56116 (DSP56100 Family) Benchmarks 6329 


12.13.3 Codec Routines 


loglin.asm 1.0 Companded CODEC to linear PCM data 4572 
conversion 
loglin.hlp Help for loglin.asm 1479 
loglint.asm 1.0 Test program for loglin.asm 2184 
loglint.hlp Help for loglint.asm 1993 
linlog.asm 1.1 Linear PCM to companded CODEC data 4847 
conversion 
linlog.hlp Help for linlog.asm 1714 
12.13.4 DTMF Routines 
clear.cmd 1.0 Explained in read.me file 119 
data.lod 1.0 421 
det.asm 1.0 Subroutine used in IIR DTMF 5923 
dtmf.asm 1.0 Main routine used in IIR DTMF 10685
dtmf.mem 1.0 Memory for DTMF routine 48 
dtmfmstr.asm 1.0 Main routine for multichannel DTMF 7409 
dtmfmstr.mem 1.0 Memory for multichannel DTMF routine 41 
dtmftwo.asm 1.0 10256 
ex56.bat 1.0 94 
genxd.lod 1.0 Data file 183 
genyd.lod 1.0 Data file 180 
goertzel.asm 1.0 Goertzel routine 4393 
goertzel.lnk 1.0 Link file for Goertzel routine 6954
goertzel.lst 1.0 List file for Goertzel routine 11600
load.cmd 1.0 46 
tstgoert.mem 1.0 Memory for Goertzel routine 384 
sub.asm 1.0 Subroutine linked for use in IIR DTMF 2491 
read.me 1.0 Instructions 738 


12.13.5 Fast Fourier Transforms 


sincos.asm 1.2 Sine-Cosine Table Generator for FFTs 1185 
sincos.hlp Help for sincos.asm 887 
sinewave.asm 1.1 Full-Cycle Sine wave Table Generator 1029
Macro
sinewave.hlp Help for sinewave.asm 1395
fftr2a.asm 1.1 Radix 2, In-Place, DIT FFT (smallest) 3386 
fftr2a.hlp Help for fftr2a.asm 2693 
fftr2at.asm 1.1 Test Program for FFTs (fftr2a.asm) 999 
fftr2at.hlp Help for fftr2at.asm 563 
fftr2b.asm 1.1 Radix 2, In-Place, DIT FFT (faster) 4290 
fftr2b.hlp Help for fftr2b.asm 3680 
fftr2c.asm 1.2 Radix 2, In-Place, DIT FFT (even faster) 5991 
fftr2c.hlp Help for fftr2c.asm 3231 
fftr2d.asm 1.0 Radix 2, In-Place, DIT FFT (using 3727 
DSP56001 sine-cosine ROM tables) 
fftr2d.hlp Help for fftr2d.asm 3457 
fftr2dt.asm 1.0 Test program for fftr2d.asm 1287 
fftr2dt.hlp Help for fftr2dt.asm 614 
fftr2e.asm 1.0 1024 Point, Non-In-Place, FFT (3.39ms) 8976 
fftr2e.hlp Help for fftr2e.asm 5011 
fftr2et.asm 1.0 Test program for fftr2e.asm 984 
fftr2et.hlp Help for fftr2et.asm 408 
dct1.asm 1.1 Discrete Cosine Transform using FFT 5493
dct1.hlp 1.1 Help file for dct1.asm 970 
fftr2cc.asm 1.0 Radix 2, In-place Decimation-in-time 6524 
complex FFT macro 
fftr2cc.hlp 1.0 Help file for fftr2cc.asm 3533 
fftr2cn.asm 1.0 Radix 2, Decimation-in-time Complex FFT 6584 
macro with normally ordered input/output 
fftr2cn.hlp 1.0 Help file for fftr2cn.asm 2468 
fftr2en.asm 1.0 1024 point, not-in-place, complex FFT 9723 
macro with normally ordered input/output 
fftr2en.hlp 1.0 Help file for fftr2en.asm 4886 
dhit1.asm 1.0 Routine to compute Hilbert transform 1851 
in the frequency domain 
dhit1.hlp 1.0 Help file for dhit1.asm 1007
fftr2bf.asm 1.0 Radix-2, decimation-in-time FFT with 13526 
block floating point 
fftr2bf.hlp 1.0 Help file for fftr2bf.asm 1578 
fftr2aa.asm 1.0 FFT program for automatic scaling 3172 
12.13.6 Filters 
fir.asm 1.0 Direct Form FIR Filter 545 
fir.hlp Help for fir.asm 2161 
firt.asm 1.0 Test program for fir.asm 1164 
iir1.asm 1.0 Direct Form Second Order All Pole 656 
IIR Filter 
iir1.hlp Help for iir1.asm 1786
iir1t.asm 1.0 Test program for iir1.asm 1157
iir2.asm 1.0 Direct Form Second Order All Pole 801 
IIR Filter with Scaling 
iir2.hlp Help for iir2.asm 2286
iir2t.asm 1.0 Test program for iir2.asm 1311 
iir3.asm 1.0 Direct Form Arbitrary Order All 776 
Pole IIR Filter 
iir3.hlp Help for iir3.asm 2605
iir3t.asm 1.0 Test program for iir3.asm 1309 
iir4.asm 1.0 Second Order Direct Canonic IIR Filter 713 
(Biquad IIR Filter) 
iir4.hlp Help for iir4.asm 2255 
iir4t.asm 1.0 Test program for iir4.asm 1202 
iir5.asm 1.0 Second Order Direct Canonic IIR Filter 842 
with Scaling (Biquad IIR Filter) 
iir5.hlp Help for iir5.asm 2803 
iir5t.asm 1.0 Test program for iir5.asm 1289
iir6.asm 1.0 Arbitrary Order Direct Canonic IIR 923 
Filter 
iir6.hlp Help for iir6.asm 3020 
iir6t.asm 1.0 Test program for iir6.asm 1377 
iir7.asm 1.0 Cascaded Biquad IIR Filters 900 
iir7.hlp Help for iir7.asm 3947
iir7t.asm 1.0 Test program for iir7.asm 1432 
lms.hlp 1.0 LMS Adaptive Filter Algorithm 5818
transiir.asm 1.0 Implements the transposed IIR filter 1981 
transiir.hlp 1.0 Help file for transiir.asm 974 


12.13.7 Floating-Point Routines 


fpdef.hlp 2.0 Storage format and arithmetic 10600
representation definition 
fpcalls.hlp 2.1 Subroutine calling conventions 11876 
fplist.asm 2.0 Test file that lists all subroutines 1601 
fprevs.hlp 2.0 Latest revisions of floating-point lib 1799 
fpinit.asm 2.0 Library initialization subroutine 2329 
fpadd.asm 2.0 Floating point add 3860 
fpsub.asm 2.1 Floating point subtract 3072
fpcmp.asm 2.1 Floating point compare 2605 
fpmpy.asm 2.0 Floating point multiply 2250 
fpmac.asm 2.1 Floating point multiply-accumulate 2712 
fpdiv.asm 2.0 Floating point divide 3835 
fpsqrt.asm 2.0 Floating point square root 2873 
fpneg.asm 2.0 Floating point negate 2026 
fpabs.asm 2.0 Floating point absolute value 1953 
fpscale.asm 2.0 Floating point scaling 2127 
fpfix.asm 2.0 Floating to fixed point conversion 3953 
fpfloat.asm 2.0 Fixed to floating point conversion 2053 
fpceil.asm 2.0 Floating point CEIL subroutine 1771 
fpfloor.asm 2.0 Floating point FLOOR subroutine 2119 
durbin.asm 1.0 Solution for LPC coefficients 5615 
durbin.hlp 1.0 Help file for DURBIN.ASM 2904 
fpfrac.asm 2.0 Floating point FRACTION subroutine 1862 
12.13.8 Functions 
log2.asm 1.0 Log base 2 by polynomial 1118 
approximation 
log2.hlp Help for log2.asm 719
log2t.asm 1.0 Test program for log2.asm 1018 
log2nrm.asm 1.0 Normalizing base 2 logarithm macro 2262 
log2nrm.hlp Help for log2nrm.asm 676 
log2nrmt.asm 1.0 Test program for log2nrm.asm 1084 
exp2.asm 1.0 Exponential base 2 by polynomial 926 
approximation 
exp2.hlp Help for exp2.asm 759 
exp2t.asm 1.0 Test program for exp2.asm 1019 
sqrt1.asm 1.0 Square Root by polynomial 991 
approximation, 7 bit accuracy 
sqrt1.hlp Help for sqrt1.asm 7
sqrt1t.asm 1.0 Test program for sqrt1.asm 1065
sqrt2.asm 1.0 Square Root by polynomial 899 
approximation, 10 bit accuracy 
sqrt2.hlp Help for sqrt2.asm 776 
sqrt2t.asm 1.0 Test program for sqrt2.asm 1031 
sqrt3.asm 1.0 Full precision Square Root Macro 1388 
sqrt3.hlp Help for sqrt3.asm 794 
sqrt3t.asm 1.0 Test program for sqrt3.asm 1053 
tli.asm 1.1 Linear table lookup/interpolation 3253 
routine for function generation 
tli.hlp 1.1 Help for tli.asm 1510 
bingray.asm 1.0 Binary to Gray code conversion macro 601 
bingrayt.asm 1.0 Test program for bingray.asm 991 
rand1.asm 1.1 Pseudo Random Sequence Generator 2446 
rand1.hlp Help for rand1.asm 704 
12.13.9 Lattice Filters 
latfir1.asm 1.0 Lattice FIR Filter Macro 1156
latfir1.hlp Help for latfir1.asm 6327
latfir1t.asm 1.0 Test program for latfir1.asm 1424
latfir2.asm 1.0 Lattice FIR Filter Macro 1174 
(modified modulo count) 
latfir2.hlp Help for latfir2.asm 1295 
latfir2t.asm 1.0 Test program for latfir2.asm 1423 
latiir.asm 1.0 Lattice IIR Filter Macro 1257 
latiir.hlp Help for latiir.asm 6402 
latiirt.asm 1.0 Test program for latiir.asm 1407 
latgen.asm 1.0 Generalized Lattice FIR/IIR 1334 
Filter Macro 
latgen.hlp Help for latgen.asm 5485 
latgent.asm 1.0 Test program for latgen.asm 1269 
latnrm.asm 1.0 Normalized Lattice IIR Filter Macro 1407 
latnrm.hlp Help for latnrm.asm 7475
latnrmt.asm 1.0 Test program for latnrm.asm 1595 
12.13.10 Matrix Operations
matmul1.asm 1.0 [1x3][3x3]=[1x3] Matrix Multiplication 1817
matmul1.hlp Help for matmul1.asm 527
matmul2.asm 1.0 General Matrix Multiplication, C=AB 2650 
matmul2.hlp Help for matmul2.asm 780 
matmul3.asm 1.0 General Matrix Multiply-Accumulate, 2815 
C=AB+Q 
matmul3.hlp 1.0 Help for matmul3.asm 865 
12.13.11 Reed-Solomon Encoder 
readme.rs 1.0 Instructions for Reed-Solomon coding 5200 
rscd.asm 1.0 Reed-Solomon coder for DSP56000 simulator 5822 
newc.c 1.0 Reed-Solomon coder coded in C 4075 
table1.asm 1.0 Include file for R-S coder 7971 
table2.asm 1.0 Include file for R-S coder 4011 


12.13.12 Sorting Routines 


sort1.asm 1.0 Array Sort by Straight Selection 1312
sort1.hlp Help for sort1.asm 1908
sort1t.asm 1.0 Test program for sort1.asm 689
sort2.asm 1.1 Array Sort by Heapsort Method 2183 
sort2.hlp Help for sort2.asm 2004 
sort2t.asm 1.0 Test program for sort2.asm 700 


12.13.13 Speech 


lgsol1.asm 2.0 Leroux-Gueguen solution for PARCOR 4861
(LPC) coefficients

lgsol1.hlp Help for lgsol1.asm 3971

durbin1.asm 1.2 Durbin Solution for PARCOR 6360 
(LPC) coefficients 

durbin1.hlp Help for durbin1.asm 3616

adpcm.asm 1.0 32 kbits/s CCITT ADPCM Speech Coder 120512 

adpcm.hlp 1.0 Help file for adpcm.asm 14817 

adpcmns.asm 1.0 Nonstandard ADPCM source code 54733 

adpcmns.hlp 1.0 Help file for adpcmns.asm 9952

g722.zip 1.14 G.722 Speech Processing Code 235864
(pkzip file for PC)

g722.tar.Z 1.11 G.722 Speech Processing Code 339297
(Compressed tar file for Unix)


12.13.14 Standard I/O Equates 


ioequ16.asm 1.1 DSP56100 Standard I/O Equate File 10329
ioequ.asm 1.1 Motorola Standard I/O Equate File 8774
ioequlc.asm 1.1 Lower Case Version of ioequ.asm 8788
intequ.asm 1.0 Standard Interrupt Equate File 1082
intequlc.asm 1.0 Lower Case Version of intequ.asm 1082


12.13.15 Tools and Utilities 


srec.c 4.10 Utility to convert DSP56000 OMF format 38975 
to SREC. 

srec.doc 4.10 Manual page for srec.c. 7951 

srec.h 4.10 Include file for srec.c 3472 

srec.exe 4.10 Srec executable for IBM PC 22065 

sloader.asm 1.1 Serial loader from the SCI port for the 3986 
DSP56001 

sloader.hlp 1.1 Help for sloader.asm 2598 

sloader.p 1.1 Serial loader s-record file for download 736 
to EPROM 

parity.asm 1.0 Parity calculation of a 24-bit number in 1641 
accumulator A 

parity.hlp 1.0 Help for parity.asm 936 

parityt.asm 1.0 Test program for parity.asm 685 

parityt.hlp 1.0 Help for parityt.asm 259

dspbug Ordering information for free debug 882
monitor for DSP56000/DSP56001


12.13.16 Current DSP56200 Related Software 


p1 1.0 Information on 56200 Filter Software 6343
p2 1.0 Interrupt Driven Adaptive Filter Flowchart. 10916
p3 1.0 “C” code implementation of p2 25795
p4 1.0 Polled I/O Adaptive Filter Flowchart 10361
p5 1.0 “C” code implementation of p4 24806
p6 1.1 Interrupt Driven Dual FIR Filter Flowchart. 9535
p7 1.0 “C” code implementation of p6 28489
p8 1.0 Polled I/O Dual FIR Filter Flowchart 9656
p9 1.0 “C” code implementation of p8 28525


12.14 REFERENCE BOOKS AND MANUALS 

A list of DSP-related books is included here as an aid for the engineer who is new to the 
field of DSP. This is a partial list of DSP references intended to help the new user find 
useful information in some of the many areas of DSP applications. Many books could be 
included in several categories but are not repeated. 


12.14.1. General DSP 
ADVANCED TOPICS IN SIGNAL PROCESSING 
Jae S. Lim and Alan V. Oppenheim 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1988 


APPLICATIONS OF DIGITAL SIGNAL PROCESSING 
A. V. Oppenheim 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1978 


DISCRETE-TIME SIGNAL PROCESSING 
A. V. Oppenheim and R. W. Schafer 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1989 


DIGITAL PROCESSING OF SIGNALS THEORY AND PRACTICE 
Maurice Bellanger 
New York, NY: John Wiley and Sons, 1984 


DIGITAL SIGNAL PROCESSING 
Alan V. Oppenheim and Ronald W. Schafer 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1975 


DIGITAL SIGNAL PROCESSING: A SYSTEM DESIGN APPROACH 
David J. DeFatta, Joseph G. Lucas, and William S. Hodgkiss 
New York, NY: John Wiley and Sons, 1988 


FOUNDATIONS OF DIGITAL SIGNAL PROCESSING AND DATA ANALYSIS 
J. A. Cadzow 
New York, NY: Macmillan Publishing Company, 1987


HANDBOOK OF DIGITAL SIGNAL PROCESSING
D. F. Elliott 
San Diego, CA: Academic Press, Inc., 1987 


INTRODUCTION TO DIGITAL SIGNAL PROCESSING 
John G. Proakis and Dimitris G. Manolakis 
New York, NY: Macmillan Publishing Company, 1988 


MULTIRATE DIGITAL SIGNAL PROCESSING 
R. E. Crochiere and L. R. Rabiner 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1983 


SIGNAL PROCESSING ALGORITHMS 
S. Stearns and R. Davis 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1988 


SIGNAL PROCESSING HANDBOOK 
C.H. Chen 
New York, NY: Marcel Dekker, Inc., 1988 


SIGNAL PROCESSING — THE MODERN APPROACH 
James V. Candy 
New York, NY: McGraw-Hill Company, Inc., 1988 


THEORY AND APPLICATION OF DIGITAL SIGNAL PROCESSING 
Lawrence R. Rabiner and Bernard Gold
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1975 


12.14.2 Digital Audio and Filters 
ADAPTIVE FILTERS AND EQUALISERS
B. Mulgrew and C. Cowan
Hingham, MA: Kluwer Academic Publishers, 1988


ADAPTIVE SIGNAL PROCESSING 
B. Widrow and S. D. Stearns 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1985 


ART OF DIGITAL AUDIO, THE 
John Watkinson 
Stoneham, MA: Focal Press, 1988


DESIGNING DIGITAL FILTERS 
Charles S. Williams 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1986 


DIGITAL AUDIO SIGNAL PROCESSING AN ANTHOLOGY 
John Strawn 
William Kaufmann, Inc., 1985 


DIGITAL CODING OF WAVEFORMS 
N. S. Jayant and Peter Noll 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984 


DIGITAL FILTERS: ANALYSIS AND DESIGN 
Andreas Antoniou 
New York, NY: McGraw-Hill Company, Inc., 1979 


DIGITAL FILTERS AND SIGNAL PROCESSING 
Leland B. Jackson 
Hingham, MA: Kluwer Academic Publishers, 1986


DIGITAL SIGNAL PROCESSING 
Richard A. Roberts and Clifford T. Mullis 
New York, NY: Addison-Wesley Publishing Company, Inc., 1987


INTRODUCTION TO DIGITAL SIGNAL PROCESSING 
Roman Kuc 
New York, NY: McGraw-Hill Company, Inc., 1988 


INTRODUCTION TO ADAPTIVE FILTERS 
Simon Haykin 
New York, NY: Macmillan Publishing Company, 1984


MUSICAL APPLICATIONS OF MICROPROCESSORS (Second Edition) 
H. Chamberlin 
Hasbrouck Heights, NJ: Hayden Book Co., 1985 


12.14.3 C Programming Language
C: A REFERENCE MANUAL
Samuel P. Harbison and Guy L. Steele 
Prentice-Hall Software Series, 1987. 


PROGRAMMING LANGUAGE - C 
American National Standards Institute, 
ANSI Document X3.159-1989 
American National Standards Institute, Inc., 1990


THE C PROGRAMMING LANGUAGE 
Brian W. Kernighan and Dennis M. Ritchie
Prentice-Hall, Inc., 1978. 


12.14.4 Controls 
ADAPTIVE CONTROL 
K. Astrom and B. Wittenmark 
New York, NY: Addison-Wesley Publishing Company, Inc., 1989


ADAPTIVE FILTERING PREDICTION & CONTROL
G. Goodwin and K. Sin 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984 


AUTOMATIC CONTROL SYSTEMS 
B. C. Kuo 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1987 


COMPUTER CONTROLLED SYSTEMS: THEORY & DESIGN 
K. Astrom and B. Wittenmark 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984 


DIGITAL CONTROL SYSTEMS 
B. C. Kuo 
New York, NY: Holt, Rinehart, and Winston, Inc., 1980


DIGITAL CONTROL SYSTEM ANALYSIS & DESIGN 
C. Phillips and H. Nagle 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1984 


ISSUES IN THE IMPLEMENTATION OF DIGITAL FEEDBACK COMPENSATORS 
P. Moroney 
Cambridge, MA: The MIT Press, 1983 


12.14.5 Graphics 
CGM AND CGI 
D. B. Arnold and P. R. Bono 
New York, NY: Springer-Verlag, 1988 


COMPUTER GRAPHICS (Second Edition) 
D. Hearn and M. Pauline Baker 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1986 


FUNDAMENTALS OF INTERACTIVE COMPUTER GRAPHICS 
J. D. Foley and A. Van Dam 
Reading, MA: Addison-Wesley Publishing Company, Inc., 1984


GEOMETRIC MODELING 
Michael E. Mortenson
New York, NY: John Wiley and Sons, Inc. 


GKS THEORY AND PRACTICE 
P. R. Bono and I. Herman (Eds.)
New York, NY: Springer-Verlag, 1987 


ILLUMINATION AND COLOR IN COMPUTER GENERATED IMAGERY 
Roy Hall 
New York, NY: Springer-Verlag


POSTSCRIPT LANGUAGE PROGRAM DESIGN
Glenn C. Reid - Adobe Systems, Inc. 
Reading, MA: Addison-Wesley Publishing Company, Inc., 1988


MICROCOMPUTER DISPLAYS, GRAPHICS, AND ANIMATION 
Bruce A. Artwick 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1985 


PRINCIPLES OF INTERACTIVE COMPUTER GRAPHICS 
William M. Newman and Roger F. Sproull 
New York, NY: McGraw-Hill Company, Inc., 1979 


PROCEDURAL ELEMENTS FOR COMPUTER GRAPHICS 
David F. Rogers 
New York, NY: McGraw-Hill Company, Inc., 1985 


RENDERMAN INTERFACE, THE 
Pixar 
San Rafael, CA. 94901 


12.14.6 Image Processing 
DIGITAL IMAGE PROCESSING 
William K. Pratt 
New York, NY: John Wiley and Sons, 1978 


DIGITAL IMAGE PROCESSING (Second Edition) 
Rafael C. Gonzalez and Paul Wintz
Reading, MA: Addison-Wesley Publishing Company, Inc., 1977 


DIGITAL IMAGE PROCESSING TECHNIQUES 
M. P. Ekstrom 
New York, NY: Academic Press, Inc., 1984 


DIGITAL PICTURE PROCESSING 
Azriel Rosenfeld and Avinash C. Kak 
New York, NY: Academic Press, Inc., 1982 


SCIENCE OF FRACTAL IMAGES, THE 
M. F. Barnsley, R. L. Devaney, B. B. Mandelbrot, H. O. Peitgen, 
D. Saupe, and R. F. Voss 
New York, NY: Springer-Verlag 


12.14.7 Motorola DSP Manuals 
MOTOROLA DSP LINKER/LIBRARIAN REFERENCE MANUAL 
Motorola, Inc., 1992.


MOTOROLA DSP ASSEMBLER REFERENCE MANUAL
Motorola, Inc., 1992.


MOTOROLA DSP SIMULATOR REFERENCE MANUAL
Motorola, Inc., 1992.


MOTOROLA DSP56000/DSP56001 USER’S MANUAL
Motorola, Inc., 1990.


MOTOROLA DSP56100 FAMILY MANUAL
Motorola, Inc., 1992.


MOTOROLA DSP56156 USER’S MANUAL
Motorola, Inc., 1992.


MOTOROLA DSP56166 USER’S MANUAL
Motorola, Inc., 1992.


MOTOROLA DSP96002 USER’S MANUAL
Motorola, Inc., 1989.


12.14.8 Numerical Methods 
ALGORITHMS (THE CONSTRUCTION, PROOF, AND ANALYSIS OF 
PROGRAMS) 
P. Berlioux and P. Bizard
New York, NY: John Wiley and Sons, 1986 


MATRIX COMPUTATIONS 
G. H. Golub and C. F. Van Loan 
Johns Hopkins Press, 1983


NUMERICAL RECIPES IN C - THE ART OF SCIENTIFIC COMPUTING
William H. Press, Brian P. Flannery, 
Saul A. Teukolsky, and William T. Vetterling 
Cambridge University Press, 1988 


NUMBER THEORY IN SCIENCE AND COMMUNICATION 
Manfred R. Schroeder 
New York, NY: Springer-Verlag, 1986 


12.14.9 Pattern Recognition 
PATTERN CLASSIFICATION AND SCENE ANALYSIS 
R. O. Duda and P. E. Hart 
New York, NY: John Wiley and Sons, 1973


CLASSIFICATION ALGORITHMS
Mike James
New York, NY: Wiley-Interscience, 1985


12.14.10 Spectral Analysis


STATISTICAL SPECTRAL ANALYSIS, A NONPROBABILISTIC THEORY 
William A. Gardner 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1988 


THE FAST FOURIER TRANSFORM AND ITS APPLICATIONS 
E. Oran Brigham 
Englewood Cliffs, NJ: Prentice-Hall, Inc., 1988 


THE FOURIER TRANSFORM AND ITS APPLICATIONS
R. N. Bracewell 
New York, NY: McGraw-Hill Company, Inc., 1986 


APPENDIX A


PRELIMINARY 


DSP56100 FAMILY INSTRUCTION SET 


[Figure: DSP56100 family instruction set, grouped by instruction type (partial list)]

Arithmetic: ABS, ADC, CLR24, CMP, CMPM, DEC, DEC24, IMPY, INC24, MAC, MACR, MPY, MPYR, MPY(su,uu), MAC(su,uu), NEGC, NORM, RND, SBC, SUB, SUBL, SWAP, Tcc, TFR, TFR2, TST, TST2
Logical: AND, ANDI, EOR
Loop: DO, DO FOREVER, ENDDO, BRKcc
Move: LEA, MOVE, MOVE(C), MOVE(I), MOVE(M), MOVE(P)
Program Control: JScc, RTS, STOP, SWI, WAIT


A.1 INTRODUCTION


This appendix contains detailed information about each instruction in the DSP56100 family instruction set. 
An instruction guide is presented first to help in understanding the individual instruction descriptions. This 
is followed by sections on notation and addressing modes. Since the move instruction is a parallel move 
with an ALU NOP, the parallel moves are grouped with the MOVE instruction. The instructions are then
described in alphabetical order.


A.1.1 Instruction Guide 


The following information is included in each instruction description with the goal of making each description 
self-contained: 


Name and Mnemonic: The mnemonic is highlighted in bold type for easy reference. 


Assembler Syntax and Operation: For each instruction syntax the corresponding operation is symbolically 
described. If there are several operations indicated on a single line in the operation field, those operations 
do not necessarily occur in the order shown but are generally assumed to occur in parallel. If a parallel data 
move is allowed it will be indicated in parentheses in both the assembler syntax and operation fields. If a
letter in the mnemonic is optional it will be shown in parentheses in the assembler syntax field.


Description: A complete text description of the instruction is given together with any special cases and/or 
condition code anomalies which the user should be aware of when using that instruction. 


Example: An example of the use of the instruction is given. The example is shown in the DSP56100 assembler
source code format. Most arithmetic and logical instruction examples include one or two parallel data
moves to illustrate the many types of parallel moves that are possible. The example includes a complete
explanation which discusses the contents of the registers referenced by the instruction (but not those
referenced by the parallel moves) both before and after the execution of the instruction. Most examples are
designed to be easily understood without the use of a calculator. The contents shown in registers are in
hexadecimal format.


Condition Codes: The status register is depicted with the condition code bits which can be affected by the 
instruction highlighted in bold type. Not all bits in the status register are used. Those which are reserved are 
indicated with a double asterisk and are read as zeros. 


Instruction Format: The instruction fields, the instruction opcode and the instruction extension word are
specified for each instruction syntax. When the extension word is optional it is so indicated. The values
which can be assumed by each of the variables in the various instruction fields are shown under the
instruction fields heading. Note that the symbols used in decoding the various opcode fields of an instruction are
completely arbitrary. Furthermore, the opcode symbols used in one instruction are completely independent
of the opcode symbols used in a different instruction.


Timing: The number of oscillator clock cycles required for each instruction syntax is given. This information
gives the user a basis for comparing the execution times of the various instructions in oscillator clock
cycles. Please refer to Table A-1 and the section entitled “Instruction Timing” for a complete explanation
of instruction timing including the meaning of the symbols “aio”, “ap”, “ax”, “ax2”, “ea”, “jx”, “mv”, “mvb”,
“mvc”, “mvm”, “mvp”, “rx”, “wio”, “wp”, and “wx”.


Memory: The number of program memory words required for each instruction syntax is given. This information
gives the user a basis for comparing the number of program memory locations required for each
of the various instructions in 16-bit program memory words. Please refer to Table A-1 and the section
entitled “Instruction Timing” for a complete explanation of instruction memory requirements including the
meaning of the symbols “ea” and “mv”.


A.2. NOTATION 


Each instruction description contains symbols used to abbreviate certain operands and operations. Table 
A-1 lists the symbols used and their respective meanings. 


Table A-1 Instruction Description Notation 


Data ALU Registers Operands


Xn Input register X1 or X0 (16 bits)
Yn Input register Y1 or Y0 (16 bits)
An Accumulator registers A2, A1, A0 (A2 - 8 bits, A1 and A0 - 16 bits)
Bn Accumulator registers B2, B1, B0 (B2 - 8 bits, B1 and B0 - 16 bits)


X Input register X = X1:X0 (32 bits)
Y Input register Y = Y1:Y0 (32 bits)
A Accumulator A = A2:A1:A0 (40 bits) *
B Accumulator B = B2:B1:B0 (40 bits) *


* Note: In data move operations, shifting and limiting are performed when this register is specified as a
source operand. When specified as a destination operand, sign extension and possibly zeroing
are performed.
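
The register layout and the note above can be illustrated with a small behavioural model. This is a sketch only, not the hardware definition: the function names are hypothetical, no scaling is assumed, and the limiting values ($7FFF and $8000) and the zeroing of A0 on a word write are assumptions based on the usual behaviour of the data shifter/limiter.

```python
MASK40 = (1 << 40) - 1

def pack_acc(a2, a1, a0):
    """Concatenate A2:A1:A0 (8, 16 and 16 bits) into one 40-bit integer."""
    return ((a2 & 0xFF) << 32) | ((a1 & 0xFFFF) << 16) | (a0 & 0xFFFF)

def unpack_acc(acc):
    """Split a 40-bit accumulator value back into (A2, A1, A0)."""
    acc &= MASK40
    return (acc >> 32) & 0xFF, (acc >> 16) & 0xFFFF, acc & 0xFFFF

def write_word_to_acc(word):
    """Accumulator as destination (assumed): A1 = word, A2 = sign extension, A0 = 0."""
    word &= 0xFFFF
    a2 = 0xFF if word & 0x8000 else 0x00
    return pack_acc(a2, word, 0)

def read_acc_limited(acc):
    """Accumulator as source, no scaling (assumed): return A1, or a limited value
    when bits 39..31 are not all equal, i.e. the extension is in use."""
    acc &= MASK40
    top9 = acc >> 31                      # bits 39..31
    if top9 in (0x000, 0x1FF):
        return (acc >> 16) & 0xFFFF       # value fits: pass A1 through
    return 0x8000 if acc >> 39 else 0x7FFF  # saturate by sign
```

For instance, `read_acc_limited(pack_acc(0x01, 0x0000, 0x0000))` yields the positive limit because the value no longer fits in A1 alone.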


Table A-1 Instruction Description Notation (continued) 


Address ALU Registers Operands


Rn Address registers R0 through R3 (16 bits)
Nn Address offset registers N0 through N3 (16 bits)


Program Controller Registers 


PC Program counter register (16 bits) 

MR Mode register (8 bits) 

CCR Condition code register (8 bits) 

SR Status register = MR:CCR (16 bits) 

OMR Operating mode register (8 bits) 

LA Hardware loop address register (16 bits) 

LC Hardware loop counter register (16 bits) 

SP System stack pointer register (6 bits) 

SSH Upper portion of the current top of the stack (16 bits) 

SSL Lower portion of the current top of the stack (16 bits) 

SS System stack RAM = SSH:SSL (15 locations by 32 bits)


Address Operands

ea Effective address 

eax Effective address for X bus 

XXXX Absolute address (16 bits) 

XX Short jump address (8 bits) 

aa Absolute short address (5 bits, zero extended) 

ee 6 bit PC relative signed address 

AA 6-bit absolute signed address 

pp I/O short address (5 bits, one’s extended) 

<...> Specifies the contents of the specified address 

X: X memory reference

P: Program memory reference 


Miscellaneous Operands 


S,Sn Source operand register 
D,Dn Destination operand register 
D[n] Bit n of D destination operand register 
#XX Immediate short data (8 bits) 
#XXXX Immediate data (16 bits) 
Table A-1 Instruction Description Notation (continued)


Unary Operators

¯ The over bar is the negation operator
PUSH Push specified value onto the system stack (SS) operator
PULL Pull specified value from the system stack (SS) operator
READ Read the top of the system stack (SS) operator
PURGE Delete the top value on the system stack (SS) operator
| | Absolute value operator

Binary Operators

+ Addition operator
- Subtraction operator
* Multiplication operator
/ Division operator
+ Logical inclusive OR operator
• Logical AND operator
⊕ Logical exclusive OR operator
→ “Is transferred to” operator
: Concatenation operator
SS System stack RAM = SSH:SSL (15 locations by 32 bits)

Addressing Mode Operators

<< I/O short addressing mode force operator
< Short addressing mode force operator
> Long addressing mode force operator
# Immediate addressing mode operator
#> Immediate long addressing mode force operator
#< Immediate short addressing mode force operator

Mode Register (MR) Symbols

LF Loop Flag bit indicating when a DO loop is in progress
FV ForeVer flag bit indicating when a DO FOREVER loop is in progress
S1,S0 Scaling Mode bits indicating the current scaling mode
I1,I0 Interrupt Mask bits indicating the current interrupt priority level
Table A-1 Instruction Description Notation (continued)


Condition Code Register (CCR) Symbols (standard definitions)

S Sticky bit set during moves from accumulators to memory according to its
definition (See Sections 5.3 and A.4)

L Limit bit indicating arithmetic overflow and/or data shifting/limiting

E Extension bit indicating if the integer portion of A or B is in use

U Unnormalized bit indicating if the A or B result is unnormalized

N Negative bit indicating if bit 39 of the A or B result is set

Z Zero bit indicating if the A or B result equals zero

V Overflow bit indicating if arithmetic overflow has occurred in A or B

C Carry bit indicating if a carry or borrow occurred in A or B result


Instruction Timing Symbols

aio The time required to access an I/O operand 

ap The time required to access a P memory operand 

ax The time required to access an X memory operand 

axx The time required to access X memory operands for double read 

ea The time or number of words required for an effective address calculation 

eab The time required for an effective address calculation for branch instructions 

jx The time required to execute part of a jump-type instruction 

mv The time or number of words required for a move-type operation 

mvb The time required to execute part of a bit manipulation instruction 

mvc The time required to execute part of a MOVEC instruction 

mvm The time required to execute part of a MOVEM instruction 

mvp The time required to execute part of a MOVEP instruction 

rx The time required to execute part of an RTI or RTS instruction 

wp The number of wait states used in accessing external P memory 

wx The number of wait states used in accessing external X memory


Other Symbols

() Optional letter, operand or operation 

(...) Any arithmetic or logical instruction which allows parallel moves 

EXT Extension register portion of an accumulator (A2 or B2) 

LS Least significant 

LSP Least significant portion of an accumulator (A0 or B0)

MS Most significant 

MSP Most significant portion of an accumulator (A1 or B1) 

r Rounding constant 

S/L Shifting and/or limiting on a Data ALU register 

Sign Ext Sign extension of a Data ALU register 

Zero Zeroing of a Data ALU register 
A.3 ADDRESSING MODES


The addressing modes are grouped into three categories — register direct, address register indirect and 
special. These addressing modes are summarized in Table A-2. All address calculations are performed in 
the Address ALU to minimize execution time and loop overhead. Addressing modes specify whether the 
operands are in registers, in memory or in the instruction itself (such as immediate data) and provide the 
specific address of the operands. 


The register direct addressing mode can be subclassified according to the specific register addressed. The
data registers include X1, X0, Y1, Y0, X, Y, A2, A1, A0, B2, B1, B0, A, and B. The control registers include
SR, OMR, SP, SSH, SSL, LA, LC, CCR, and MR.


Address register indirect modes use an address register Rn (R0-R3) to point to locations in X and P memory.
The contents of the Rn address register are the effective address of the specified operand, except in the
“indexed by offset” mode where the effective address is (Rn+Nn). Address register indirect modes use an
address modifier register Mn to specify the type of arithmetic to be used to generate the ea. If an addressing
mode specifies an address offset register, the given address offset register is used to update the
corresponding address register. The Rn address register may only use the corresponding address offset register
Nn and the corresponding address modifier register Mn. For example, the address register R0 may only use
the N0 address offset register and the M0 address modifier register during actual address computation and
address register update operations. This implementation allows the user to easily address a wide variety
of DSP oriented data structures. All address register indirect modes use at
least one set of address registers (Rn, Nn, and Mn), and the double X memory read uses two sets of address
registers, one for the first X memory read and one for the second X memory read. Only R3:N3 can
be used for this second X memory read and R3 is updated only using linear arithmetic.
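
The address register indirect modes described above can be sketched as a small model of one register set. This is an illustration under stated assumptions, not the Address ALU implementation: the class and method names are hypothetical, Mn is assumed to select linear arithmetic (modulo 65536), and Rn is assumed to be left unchanged by the “indexed by offset” mode.

```python
class AddressSet:
    """One register set (Rn, Nn); Mn is assumed to select linear arithmetic."""

    def __init__(self, r, n=0):
        self.r = r & 0xFFFF   # address register Rn
        self.n = n & 0xFFFF   # offset register Nn (same n as Rn)

    def no_update(self):
        """(Rn): effective address is Rn, no update."""
        return self.r

    def post_increment(self):
        """(Rn)+: use Rn as the address, then add 1."""
        ea = self.r
        self.r = (self.r + 1) & 0xFFFF
        return ea

    def post_increment_by_offset(self):
        """(Rn)+Nn: use Rn as the address, then add Nn."""
        ea = self.r
        self.r = (self.r + self.n) & 0xFFFF
        return ea

    def indexed_by_offset(self):
        """(Rn+Nn): effective address is Rn+Nn; Rn assumed unchanged."""
        return (self.r + self.n) & 0xFFFF

    def pre_decrement(self):
        """-(Rn): subtract 1 from Rn, then use it as the address."""
        self.r = (self.r - 1) & 0xFFFF
        return self.r
```

Keeping Rn and Nn in one object mirrors the rule that Rn may only be paired with the offset and modifier registers of the same number.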


The special addressing modes include immediate and absolute addressing modes as well as implied
references to the program counter (PC), the system stack (SSH or SSL), and program (P) memory.


Addressing modes may also be categorized by the ways in which they may be used. Table A-3 shows the 
various categories to which each addressing mode belongs. The following classifications will be used in the 
instruction descriptions. 
Table A-2 DSP56100 Family Addressing Modes


Addressing Mode   Uses Mn Modifier   Operand Reference (S C D A P X XX)
Register Direct 
Data or Control Register No X| X] X 
Address Register Rn No X 
Address Modifier Register Mn No X 
Address Offset Register Nn No X 
Address Register Indirect 
No Update No X| X 
Postincrement by 1 Yes* X| X] X 
Postdecrement by 1 Yes X| X 
Postincrement by Offset Nn Yes* X| X] X 
Indexed by Offset Nn Yes X 
Predecrement by 1 Yes X 
PC Relative 
Long Displacement No xX 
Short Displacement No Xx X 
Address Register No xX X 
Special 
Upper word of accumulator No X 
Immediate Data No X 
Immediate Short Data No X 
Absolute Address No X| X 
Absolute Short Address No X| X 
Short Jump Address No X 
I/O Short Address No X 
Implicit No X |X X 
Indexed by short displacement No X 


Where:

S = System Stack Reference
C = Program Controller Register Reference
D = Data ALU Register Reference
A = Address ALU Register Reference
P = Program Memory Reference
X = X Memory Reference
XX = Double X Memory Read

* Note: M3 is not used for updating R3 in the second read in the X memory
Table A-3 DSP56100 Family Addressing Mode Encoding


Addressing Mode   Addressing Categories (U P M A)   Assembler Syntax
Register Direct 
Data or Control Register X (Table A-1) 
Address Register X Rn 
Address Offset Register x Nn 
Address Modifier Register Xx Mn 
Address Register Indirect 
No Update X| X (Rn) 
Postincrement by 1 X| X} X] X (Rn)+ 
Postdecrement by 1 X| X (Rn)- 
Postincrement by Offset Nn X|X]} xX] xX (Rn)+Nn 
Indexed by Offset Nn X| X (Rn+Nn) 
Predecrement by 1 X| X -(Rn) 
Special 
Upper word of accumulator X| X (A1) or (B1) 
Immediate Data xX #XXXX 
Absolute Address xX | X XXXX 
Immediate Short Data #Xx 
Short Jump/Branch Address Xx AA or ee 
Absolute Short Address X aa 
I/O Short Address X pp 
Implicit X 
Indexed by short displacement xX | X R2+xx 


Where: 


Update Mode (U) The Update Addressing mode is used to modify registers without
any associated data move

Parallel Mode (P) The Parallel Addressing mode is used in instructions where two
effective addresses are required

Memory Mode (M) The Memory Addressing mode is used to refer to operands in
memory using an effective addressing field


Alterable Mode (A) The Alterable Addressing mode is used to refer to alterable or writ; 


The address register indirect addressing modes require that the offset register number be the same as the 
address register number. The assembler syntax “Nn” supports this feature. The assembler syntax “N” may 
be used instead of “Nn” in the address register indirect memory addressing modes. If “N” is specified, the 
offset register number is the same as the address register number. 
A.3.1 Addressing Mode Modifiers


The addressing mode selected in the instruction word is further specified by the contents of the address
modifier register Mn. The addressing mode update modifiers (M0-M3) are shown in Table A-4. There are
no restrictions on the use of modifier types with any address register indirect addressing mode.


Table A-4 Addressing Mode Modifier Summary 


16-bit Modifier Reg. (M0-M3)

MMMMMMMMMMMMMMMM Address Calculation Arithmetic
0000000000000000 Reverse Carry (Bit Reversed)
0000000000000001 Modulo 2
0000000000000010 Modulo 3
0111111111111110 Modulo 32767
0111111111111111 Modulo 32768
1000000000000000 Reserved
1111111111111110 Reserved
1111111111111111 Linear (Modulo 65536)

where MMMMMMMMMMMMMMMM = 16-bit Modifier Register Contents
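
The pattern in Table A-4 can be summarized as a decoding function. This is a sketch with a hypothetical function name; it assumes that the rows not listed follow the visible pattern, that is, a modifier value m from $0001 to $7FFF selects modulo m+1 and every value from $8000 to $FFFE is reserved.

```python
def decode_modifier(m):
    """Return (arithmetic type, modulus or None) for a 16-bit Mn value."""
    m &= 0xFFFF
    if m == 0x0000:
        return ("reverse-carry", None)   # bit-reversed addressing
    if m == 0xFFFF:
        return ("linear", 65536)         # linear, i.e. modulo 65536
    if m <= 0x7FFF:
        return ("modulo", m + 1)         # assumed: modulus is Mn + 1
    return ("reserved", None)            # assumed: $8000 through $FFFE
```

Under these assumptions, a circular buffer of N locations corresponds to a modifier value of N-1.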
A.4 CONDITION CODE COMPUTATION


The condition code portion of the status register consists of 8 defined bits: 


C — Carry 

V — Overflow 
Z— Zero 

N — Negative 


U — Unnormalized 
E — Extension 

L — Limit 

S — Sticky 


The C, V, Z, N, U, E, and S bits are true condition code bits that reflect the condition of the result of a data ALU
operation. These condition code bits are not affected by address ALU calculations or by data transfers
(except for the S and L bits) over the XDB and GDB data buses. The L bit is a latching overflow bit which indicates
that an overflow has occurred in the Data ALU or that limiting has occurred when reading a Data ALU
register. This limiting occurs when a data bus move reads accumulator data through
the data shifter/limiters. The S bit is a latching bit useful in implementing block floating point FFT algorithms.
When a move to X memory from an accumulator is made, the S bit is updated according to the definition
below; when set, it indicates that scaling should be implemented on the next FFT pass.


The standard definition of the condition codes is given below. Exceptions to these are given in Table A-5. 


C (Carry) Set if a carry is generated out of the most significant bit of the result for an addition.
Also set if a borrow is generated in a subtraction. The carry or borrow is generated out
of bit 39 of the result. Cleared otherwise.


V (Overflow) Set if an arithmetic overflow occurs in the 40 bit result. This indicates that the result
is not representable in the accumulator register and the accumulator register has
overflowed. Cleared otherwise. In Saturation Mode, V is set if an arithmetic overflow occurs
in the 32 bit result. This indicates that the result is not representable in the accumulator
register without the extension part. Cleared otherwise.


Z (Zero) Set if the result equals zero. Cleared otherwise. 
N (Negative) Set if the most significant bit, bit 39, of the result is set. Cleared otherwise. 
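
The standard definitions of C, V, Z and N above can be sketched for a 40-bit addition. This is an illustrative model only; the function name is hypothetical, Saturation Mode is not modelled, and the exceptions of Table A-5 are ignored.

```python
MASK40 = (1 << 40) - 1

def add40_flags(a, b):
    """Add two 40-bit two's complement values; return (result, flags)."""
    a &= MASK40
    b &= MASK40
    full = a + b
    result = full & MASK40
    sign_a, sign_b, sign_r = a >> 39, b >> 39, result >> 39
    flags = {
        "C": (full >> 40) & 1,                             # carry out of bit 39
        "V": int(sign_a == sign_b and sign_r != sign_a),   # 40-bit signed overflow
        "Z": int(result == 0),                             # result equals zero
        "N": sign_r,                                       # bit 39 of the result
    }
    return result, flags
```

Adding 1 to $7FFFFFFFFF, the largest positive 40-bit value, sets V and N without a carry; adding 1 to $FFFFFFFFFF produces zero with a carry and no overflow.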


U (Unnormalized) Set if the two most significant bits of the MSP portion of the result are the same.
Cleared otherwise. The MSP portion is defined by the scaling mode and the U bit is
computed as follows (NOT denotes the over bar negation operator):


Scaling Mode U bit Computation

No scaling U = NOT (Bit 31 ⊕ Bit 30)
Scale down U = NOT (Bit 32 ⊕ Bit 31)
Scale up U = NOT (Bit 30 ⊕ Bit 29)
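
As a sketch of the U bit definition above (set when the two most significant bits of the MSP portion are the same), the following model selects the bit pair by scaling mode. The function name and the mode labels "none", "down" and "up" are hypothetical.

```python
# Bit pair examined for each scaling mode, taken from the table above.
_U_BITS = {"none": (31, 30), "down": (32, 31), "up": (30, 29)}

def u_bit(acc, scaling="none"):
    """Return 1 if the two examined bits of the 40-bit value are equal, else 0."""
    hi, lo = _U_BITS[scaling]
    return int(((acc >> hi) & 1) == ((acc >> lo) & 1))
```

With no scaling, a value whose bit 31 and bit 30 differ (for example $0040000000) is normalized and gives U = 0.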
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E (Extension) Cleared if all the bits of the integer portion of the result are the same; that is, the bit 
patterns 00...00 or 11...11. Set otherwise. The integer portion is defined by the scal- 
ing mode and the E bit is computed as follows: 


| st | so Scaling Mode Integer Portion 


0 0 No scaling Bits 39,38,...,32,31 
0 1 Scale down Bits 39,38,...,32 
1 0 Scale up Bits 39,38,...,32,31,30 


If E is cleared, then the low-order fractional portion contains all the significant bits and
the high-order integer portion is sign extended. In this case, the accumulator extension
register can be ignored. This flag is meaningless if saturation has occurred (the
saturation flag is set, SAT=1).


L (Limit) Set if the overflow bit V is set. Also set if the data shifter/limiters perform a limiting
operation. In Saturation Mode, the L bit is set by the saturation of the 32-bit result.
Not affected otherwise. The L bit is latched once it is set. The L bit is cleared only by
the processor reset or an instruction that explicitly clears it. The L bit is affected by
data movement operations which read the accumulator registers.


S (Sticky) Set on moves of accumulators to X memory. This can happen when using a MOVE 
instruction or in a parallel move. The S bit is computed according to scaling modes 
as follows: 


Scaling Mode        S bit Computation

No scaling          S = Bit 30 ⊕ Bit 29
Scale down          S = Bit 31 ⊕ Bit 30
Scale up            S = Bit 29 ⊕ Bit 28


Note: The S bit is a “sticky” bit in the status register. It is cleared only 
during reset, ANDI operation, or a move to the status register. 
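
The standard E, U, N and Z definitions above can be sketched as a small Python model. This is only an illustration under stated assumptions: the function name `standard_flags`, the scaling-mode labels and the dictionary `MSP_TOP_BIT` are hypothetical, the U bit is taken as "set when the two compared bits are equal" (as the text describes), and C, V, L and S are left out because they depend on the operation performed.

```python
MASK40 = (1 << 40) - 1

# Most significant bit of the MSP portion for each scaling mode (assumed labels).
# The integer portion used for E runs from bit 39 down to this bit, and the
# U bit compares this bit with the one directly below it.
MSP_TOP_BIT = {"none": 31, "down": 32, "up": 30}

def bit(value, n):
    """Return bit n of a 40-bit accumulator value."""
    return (value >> n) & 1

def standard_flags(acc, scaling="none"):
    """Compute E, U, N and Z for a 40-bit result, following the tables above."""
    acc &= MASK40
    top = MSP_TOP_BIT[scaling]
    integer_bits = acc >> top                 # bits 39..top
    all_ones = (1 << (40 - top)) - 1
    e = 0 if integer_bits in (0, all_ones) else 1
    u = 1 if bit(acc, top) == bit(acc, top - 1) else 0
    n = bit(acc, 39)
    z = 1 if acc == 0 else 0
    return {"E": e, "U": u, "N": n, "Z": z}

if __name__ == "__main__":
    # $4A:0246:0246 is the result of the ASL example later in this appendix.
    print(standard_flags(0x4A02460246))
```

With no scaling, the value $4A:0246:0246 gives E=1 and U=1, and $FF:FFA8:A864 gives E=0, U=1 and N=1, which agrees with the ASL and ASR16 examples in this appendix.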

Figure A-1 details how each instruction affects the condition codes. The convention for the notation that is
used in the condition code register representation is:


* set according to the standard definition by the result of the operation 

— not affected by the operation 

0 cleared 

1 set 

U undefined, meaningless 

? set according to the special computation definition by the result of the operation.

Note that the condition code computation shown in Table A-5 may differ from that defined in the opcode
descriptions. This indicates that the standard definition may be used to generate the specific condition code
result. For example, the Z flag computation for the CLR instruction is shown below as the standard definition
while the opcode description indicates that the Z flag is always set. Table A-5 gives the chip implementation
viewpoint while the opcode description gives the user viewpoint.

Table A-5 Condition Code Computations

Instruction   S L E U N Z V C   Notes        Instruction   S L E U N Z V C   Notes
ABS a Oa a LSL *}* }—/}—) 2] ?] 0] 2] 9,10,11 
ADC —|*yryeyett rt rye LSR *!*}—/}—]?]?] 0] 2] 9,10,12 
ADD a ee ee oe ee LEA 
AND *}* |—/}—]?]?}]0]—) 9,10 MAC oe ee 
ANDI —|?/}?/27)?) 2?) 2?) 2] 2 MACxx —|*7 fey epetp ep eye 
MACR w|i (Nitec elf! ee. |Ietes!| tages dP ell 
ASL By RR CR RL a eae! Ag 
ASL4 —|?7}* }* ]*]*] 2] 2] 15,16 MOVE a es 
ASR eye pe pepe ye) oO] 2] 4 MOVE(C) *~|/2}2?/2)2)] 2?) 2?) 27) 14 
ASR4 —}*}r ye ye] *} of 2?) 17 MOVE(I) 
ASR16 —;*}*}* ye] *} oy 2?) 18 MOVE(M) A | oe 
MOVE(P) = | 4 
BFCHG * 2?) 5 MOVE(S) a ss 
BFCLR 2?) 6 MPY cial Picea ct 
BFSET * 2?) 5 MPYxx ee ( 
BFTSTH * 2?) 5 MPYR i a a Oe 
BFTSTL : 2?) 6 
NEG ewe |] a] ae eee) 
Bcc NEGC — * * * * * * * 
BRA NOP 
BRKcc NORM —|*7}ry eye] e] rpy—y] 4 
BScc NOT *|* }—|/—]?}]?] 0] —] 9,10 
BSR 
OR */* }/—}—]?] 2] 0/]/—] 9,10 
CHKAAU —|—|]—|—|?] 7] ?]—| 21,22,23 ORI =| 27/2?) 2)]/9] 217) 2] 7 
CLR eye pepe pep et gye REP : 
CLR24 *]e tet * te) 2) Oo] —] 19 REPcc 
CMP we | ae | ae) ee | ey ca dae RESET 
CMPM sel spe |g eco || See ae) yee Se 
RND Best ce he Wa igs ae ll cats Uh seed] 
DEC s foe pe ye fe por poe fs ROL *}* |}—/}—) 2] 2] 0] 2] 9,10,11 
DEC24 Be Rg | |B) I IG ROR *!* |—/}—]?]?] 0] 2] 9,10,12 
DIV ‘ 2/2] 1,8 
DMAC —}r free pry rp ep RTI —|?}?/27)2?] 2?) 2?) 21 13 
RTS 
DO ‘ SBC Pa eee 
DOFOREVER STOP 
DEBUG 
DEBUGcc SUB ae acl cl (dT 
ENDDO SUBL * * * * * * ? * 1 
EOR *|* |—/|—) 2] ?]0]—] 9,10 SWAP 
Swi 
EXT ee ee ee eae eo 
ILLEGAL Tec 
IMAC —|*}?/?)*] 2} 2?)—] 19,25,26 TFR 
IMPY —|*}?/?)*] 2?) 2?;)—|] 19,25,26 TFR2 : 
INC PY ast || ee |e oe | sell TFR3 x | * 
INC24 BRS Ra a pele OK | ag TST o;}*}*]*]*}*}o7o 
TST2 —|/*}*}*7*]*}] Oo} 07} 24 
Jcc 
JMP WAIT 
ee ZERO sal] sel os | see! I al] x 
JSR 
Note 1 V — Set if an arithmetic overflow occurs in the 40-bit result. Also set if the most significant
bit of the destination operand is changed as a result of the left shift. Cleared otherwise.

Note 2 All ? bits — Cleared if the corresponding bit in the immediate data is cleared and if the
operand is the CCR. Not affected otherwise.

Note 3 C — Set if bit 39 of source operand is set. Cleared otherwise. 

Note 4 C — Set if bit 0 of source operand is set. Cleared otherwise. 

Note 5 C — Set if all bits specified by the mask are set. Cleared otherwise. Ignore bits which are 
not set in the mask. 

Note 6 C — Set if all bits specified by the mask are cleared. Cleared otherwise. Ignore bits which 
are not set in the mask. 

Note 7 All ? bits — Set if the corresponding bit in the immediate data is set and if the operand is the
CCR. Not affected otherwise.

Note 8 C — Set if bit 39 of the result is cleared. Cleared otherwise. 

Note 9 N — Set if bit 31 of the result is set. Cleared otherwise. 

Note 10 Z — Set if bits 16-31 of the result are zero. Cleared otherwise. 

Note 11 C — Set if bit 31 of the source operand is set. Cleared otherwise. 

Note 12 C — Set if bit 16 of the source operand is set. Cleared otherwise. 

Note 13 All ? bits — Set according to value pulled from the stack.

Note 14 All ? bits — If SR is specified as a destination operand, set according to the corresponding
bit of the source operand. If SR is not specified as a destination operand, L is set if data
limiting occurred. All ? bits are not affected otherwise.

Note 15 V — Set if an arithmetic overflow occurs in the 40-bit result. Also set if bits 35 through 39 are
not the same.

Note 16 C — Set if bit 36 of source operand is set. Cleared otherwise. 

Note 17 C — Set if bit 3 of source operand is set. Cleared otherwise. 

Note 18 C — Set if bit 15 of source operand is set. Cleared otherwise. 

Note 19 Z — Set if the 24 most significant bits of the destination result are all zeroes. 

Note 20 In Saturation mode, only bits 31-32 of the result are examined for saturation. 

Note 21 V — Set if the result of the last address ALU update performed a modulo wrap. Cleared if 
the result of the last address ALU did not perform a modulo wrap. 

Note 22 Z — Set if the result of the last address ALU update is 0. Cleared if the result of the last 
address ALU is positive. 

Note 23 N — Set if the result of the last address ALU update is negative. Cleared if the result of the 
last address ALU is positive. 

Note 24 (L,E,U should be set to 0) 

Note 25 U,E — Will not be set correctly by this instruction 

Note 26 V — Set to zero regardless of the overflow 


A.5 DESCRIPTIONS 


The following section describes each instruction in the DSP56100 family instruction set in complete detail.
The format of each instruction description is given in the Instruction Guide at the beginning of Appendix A.
Instructions which allow parallel moves include the notation “(parallel move)” in both the Assembler Syntax
and the Operation fields. The example given with each instruction discusses the contents of all the registers
and memory locations referenced by the opcode-operand portion of that instruction, though not those
referenced by the parallel move portion of that instruction. Please refer to the “Parallel Move Descriptions”
which follow the MOVE instruction description for a complete discussion of parallel moves, including
examples which discuss the contents of all the registers and memory locations referenced by the parallel move
portion of an instruction.


Whenever an instruction uses an accumulator as both a destination operand for a Data ALU operation and 
as a source for a parallel move operation, the parallel move operation will use the value in the accumulator 
prior to execution of any Data ALU operation. 


Whenever a bit in the Condition Code Register is defined according to the standard definition as given in
Section A.4 entitled “Condition Code Computation”, a brief definition will be given in normal text in the
Condition Code section of that instruction description. Whenever a bit in the Condition Code Register is
defined according to a special definition for some particular instruction, the complete special definition of that
bit will be given in the Condition Code section of that instruction in bold text to alert the user to any special
conditions concerning its use.


The definition and thus the computation of both the E (Extension) and U (Unnormalized) bits of the Condition 
Code Register (CCR) varies according to the scaling mode being used. Please refer to the section entitled 
“Condition Code Computation” for complete details. 


Note: The signed integer portion of an accumulator is not necessarily the same as either the A2 or B2
extension register portion of that accumulator. The signed integer portion of an accumulator is defined
according to the scaling mode being used and can consist of the most significant 8, 9 or 10 bits of an accumulator.
Please refer to the “Condition Code Computation” section for complete details.

ABS                              Absolute Value                              ABS


Operation:                              Assembler Syntax:


|D| → D (parallel move)                 ABS D (parallel move)


Description: Take the absolute value of the destination operand D and store the result in the destination 
accumulator. 


Example:
ABS A X:(R0)+,X1 ;take abs. value, move data into X1, update R0

A Before Execution          A After Execution
FF   FFFF   FFF2            00   0000   000E
A2   A1     A0              A2   A1     A0


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value 
$FF:FFFF:FFF2. Since this is a negative number, the execution of the ABS instruction 
takes the two’s complement of that value and returns $00:0000:000E. 


Note: For the case in which the D operand equals $80:0000:0000 (-256.0), the ABS instruction will cause
an overflow to occur since the result cannot be correctly expressed using the standard 40-bit, fixed
point, two’s complement data representation. Data limiting does not occur, i.e., A is not set to the
limiting value of $7F:FFFF:FFFF but remains unchanged.
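
The example and the overflow note can be checked with a short Python sketch. It is a behavioral illustration only, not a description of the hardware: the helper name `abs40` is hypothetical, and the overflow test simply assumes that V is raised when the 40-bit result is still negative, which can only happen for $80:0000:0000.

```python
MASK40 = (1 << 40) - 1
SIGN40 = 1 << 39

def abs40(acc):
    """Return (result, V) for an ABS on a 40-bit two's complement value."""
    acc &= MASK40
    if acc & SIGN40:
        result = (-acc) & MASK40      # two's complement negate, wrapped to 40 bits
    else:
        result = acc
    # Only $80:0000:0000 is still negative after negation (assumed V condition).
    overflow = 1 if result & SIGN40 else 0
    return result, overflow

if __name__ == "__main__":
    for value in (0xFFFFFFFFF2, 0x8000000000):
        result, v = abs40(value)
        print(f"${value:010X} -> ${result:010X}, V={v}")
```

The first call reproduces the manual's example ($FF:FFFF:FFF2 becomes $00:0000:000E); the second leaves $80:0000:0000 unchanged and reports overflow, as the note describes.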


Condition Codes Affected:


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if overflow has occurred in A or B result

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format:

ABS D (parallel move)

Opcode:

15        12 11         8 7          4 3          0
1  m  R  R | H  H  H  W | 0  1  1  1 | F  0  0  1

Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.

D   F
A   0
B   1

Timing: 2 + mv oscillator clock cycles
Memory: 1 program word

ADC                          Add Long with Carry                          ADC


Operation:                              Assembler Syntax:


S + C + D → D (no parallel move)        ADC S,D (no parallel move)


Description: Add the source operand S and the carry bit C of the condition code register to the
destination operand D and store the result in the destination accumulator. Long words (32 bits)
may be added to the (40-bit) destination accumulator.


Note: The carry bit is set correctly for multiple precision arithmetic using long word operands if the
extension register of the destination accumulator (A2 or B2) is the sign extension of bit 31 of the
destination accumulator (A or B).


Example:
; 64 bit addition: Y1:Y0:X1:X0 + B2:B1:B0:A1:A0 = B2:B1:A1:A0
ADD X,A ;add 32-bit LS words
ADC Y,B ;add 32-bit MS words with carry

0000   0001                                  8000   0000
Y1     Y0                                    X1     X0
(Y1:Y0 not affected by the operation)        (X1:X0 not affected by the operation)

B Before Execution                           A Before Execution
00   0000   0001                             FF   8000   0000
B2   B1     B0                               A2   A1     A0

B After Execution                            A After Execution
00   0000   0003                             FF   0000   0000
B2   B1     B0                               A2   A1     A0


Explanation of Example: This example illustrates long word double precision (64-bit) addition using the
ADC instruction. Prior to execution of the ADD and ADC instructions, the 64-bit value
$0000:0001:8000:0000 is loaded into the Y and X registers (Y:X), respectively. The other
double precision 64-bit value $0000:0001:8000:0000 is loaded into the B and A accumulators
(B:A), respectively. Since the 32-bit value loaded into the A accumulator is automatically
sign extended to 40 bits and the other 32-bit long word operand is internally sign
extended to 40 bits during instruction execution, the carry bit will be set correctly after the
execution of the ADD X,A instruction. The ADC Y,B instruction then produces the correct MS
40-bit result. The actual 64-bit result is stored in B1:B0:A1:A0.
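
The ADD/ADC sequence of this example can be traced with a minimal Python sketch. The helper names `sext32_to_40` and `add40` are hypothetical, and the model covers only the 40-bit sum and the carry out of bit 39; the other condition codes are not modeled.

```python
MASK40 = (1 << 40) - 1

def sext32_to_40(x):
    """Sign extend a 32-bit long word to 40 bits."""
    x &= 0xFFFFFFFF
    return (x | 0xFF00000000) if x & 0x80000000 else x

def add40(src32, dst40, carry_in=0):
    """Add a 32-bit source (plus an optional carry) to a 40-bit accumulator.

    Returns (40-bit result, carry out of bit 39).
    """
    total = sext32_to_40(src32) + (dst40 & MASK40) + carry_in
    return total & MASK40, total >> 40

if __name__ == "__main__":
    X, Y = 0x80000000, 0x00000001          # X1:X0 and Y1:Y0 from the example
    A, B = 0xFF80000000, 0x0000000001      # A and B before execution
    A, carry = add40(X, A)                 # ADD X,A
    B, _ = add40(Y, B, carry)              # ADC Y,B
    print(f"A=${A:010X} B=${B:010X} carry from ADD={carry}")
```

Running the sketch gives A = $FF:0000:0000 and B = $00:0000:0003, matching the "After Execution" values above; the carry produced by ADD X,A is what raises the upper result from 2 to 3.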


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result is zero. Cleared otherwise
V — Set if overflow has occurred in A or B result
C — Set if a carry (or borrow) occurs from bit 39 of A or B result

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format: 
ADC S,D 
Opcode: 


Instruction Fields: 


Timing: 2 oscillator clock cycles 
Memory: 1 program word 

ADD                                    Add                                    ADD


Operation:                              Assembler Syntax:


S + D → D (parallel move)               ADD S,D (parallel move)


Description: Add the source operand S to the destination operand D and store the result in the
destination accumulator. Words (16 bits), long words (32 bits) and accumulators (40 bits) may be
added to the destination accumulator.


Note: The carry bit is set correctly using word or long word source operands if the extension register of 
the destination accumulator (A2 or B2) is the sign extension of bit 31 of the destination accumulator 
(A or B). The carry bit is always set correctly using accumulator source operands. 


Example:
ADD X0,A X:(R0)+,X0 X:(R3)+,X1 ;16-bit add, update X1,X0,R0,R3
ADD X0,A A,X:(R1)+ ;16-bit add, save accumulator

Before Last Execution          After Last Execution
00   0100   0000               00   00FF   0000
A2   A1     A0                 A2   A1     A0

FFFF                           FFFF
X0                             X0


Explanation of Example: Prior to execution, the 16-bit X0 register contains the value $FFFF and the
40-bit A accumulator contains the value $00:0100:0000. The ADD instruction automatically
appends the 16-bit value in the X0 register with 16 LS zeros, sign extends the resulting 32-bit
long word to 40 bits and adds the result to the 40-bit A accumulator. Thus, 16-bit operands
are added to the MSP portion of A or B (A1 or B1) because all arithmetic instructions
assume a fractional, two’s complement data representation. Note that 16-bit operands can be
added to the LSP portion of A or B (A0 or B0) by loading the 16-bit operand into X0 or Y0,
forming a 32-bit word by loading X1 or Y1 with the sign extension of X0 or Y0 and executing
an ADD X,A or ADD Y,A instruction.
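
The operand alignment described in this explanation can be illustrated with a few lines of Python. The function name `add_word` is hypothetical and the sketch models only the data path for a 16-bit source (zero fill, sign extension, 40-bit add), not the condition codes or the parallel moves.

```python
MASK40 = (1 << 40) - 1

def add_word(src16, acc):
    """Add a 16-bit register operand to a 40-bit accumulator."""
    src = (src16 & 0xFFFF) << 16      # append 16 LS zeros -> 32-bit long word
    if src & 0x80000000:
        src |= 0xFF00000000           # sign extend the long word to 40 bits
    return (acc + src) & MASK40

if __name__ == "__main__":
    result = add_word(0xFFFF, 0x0001000000)   # X0 = $FFFF, A = $00:0100:0000
    print(f"A = ${result:010X}")
```

With X0 = $FFFF (that is, -1 in the least significant bit of A1) the result is $00:00FF:0000, the value shown in the "After Last Execution" column.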

Condition Codes Affected:


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if overflow has occurred in A or B result
C — Set if a carry (or borrow) occurs from bit 39 of A or B result


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to
Section A.4 entitled “Condition Code Computation” for complete details.


Instruction Format:

ADD S,D (parallel move)

Opcode:

One parallel operation:
15        12 11         8 7          4 3          0
1  m  R  R | H  H  H  W | 0  0  0  0 | F  J  J  J

Two parallel reads:
15        12 11         8 7          4 3          0
0  1  1  m | m  K  K  K | 0  r  r  u | F  u  u  u

Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields. See the “Dual X Memory Read” description
in the parallel move section for details on the mm, KKK, and rr data fields.

One parallel operation:

S,D    JJJ  F        S,D    JJJ  F
B,A    000  0        X0,B   100  1
A,B    000  1        Y0,A   101  0
X,A    010  0        Y0,B   101  1
X,B    010  1        X1,A   110  0
Y,A    011  0        X1,B   110  1
Y,B    011  1        Y1,A   111  0
X0,A   100  0        Y1,B   111  1

Two parallel reads:

S,D    uuuu  F       S,D    uuuu  F
X0,A   0000  0       Y1,B   0011  1
X0,B   0000  1
Y0,A   0001  0       B,A    1100  0
Y0,B   0001  1       A,B    1100  1
X1,A   0010  0
X1,B   0010  1
Y1,A   0011  0

Timing: 2 + mv oscillator clock cycles

Memory: 1 program word

AND                                Logical AND                                AND


Operation:                                      Assembler Syntax:

S • D[31:16] → D[31:16] (parallel move)         AND S,D (parallel move)


where • denotes the logical AND operator


Description: Logically AND the source operand S with bits 31-16 of the destination operand D and store 
the result in bits 31-16 of the destination accumulator. This instruction is a 16-bit operation. 
The remaining bits of the destination operand D are not affected. 


Example:
AND X0,A (R2)-N2 ;AND X0 with A1, update R2 using N2

Before Execution             After Execution
00   1234   5678             00   1200   5678
A2   A1     A0               A2   A1     A0

FF00                         FF00
X0                           X0


Explanation of Example: Prior to execution, the 16-bit X0 register contains the value $FF00 and the
40-bit A accumulator contains the value $00:1234:5678. The AND X0,A instruction logically
ANDs the 16-bit value in the X0 register with bits 31-16 of the A accumulator (A1) and
stores the 40-bit result in the A accumulator.
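
As a quick cross-check of this example, the sketch below applies the 16-bit AND to bits 31-16 of a 40-bit value and derives N and Z the way this instruction defines them (bit 31 of the result, and bits 31-16 all zero). The function name `and16` is hypothetical; S and L are omitted because they depend on the parallel move.

```python
MASK40 = (1 << 40) - 1

def and16(src16, acc):
    """AND a 16-bit operand with bits 31-16 of a 40-bit accumulator."""
    acc &= MASK40
    a1 = (acc >> 16) & 0xFFFF
    r = a1 & src16 & 0xFFFF
    # Bits 39-32 and 15-0 of the accumulator are left untouched.
    result = (acc & ~(0xFFFF << 16) & MASK40) | (r << 16)
    flags = {"N": (r >> 15) & 1, "Z": 1 if r == 0 else 0, "V": 0}
    return result, flags

if __name__ == "__main__":
    result, flags = and16(0xFF00, 0x0012345678)
    print(f"A = ${result:010X}, flags = {flags}")
```

For X0 = $FF00 and A = $00:1234:5678 the model returns $00:1200:5678 with N, Z and V all clear.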


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition (see section A.4) 
L  — Set if data limiting has occurred during parallel move 
N — Set if bit 31 of A or B result is set 
Z — Set if bits 31-16 of A or B result are zero 
V — Always cleared 


Instruction Format: 


AND S,D (parallel move) 
Opcode: 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


S,D    JJ  F
X0,A   00  0
X0,B   00  1
Y0,A   01  0
Y0,B   01  1
X1,A   10  0
X1,B   10  1
Y1,A   11  0
Y1,B   11  1


Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 

ANDI                              AND Immediate                              ANDI


Operation:                              Assembler Syntax:


#xx • D → D (no parallel move)          AND(I) #xx,D


where • denotes the logical AND operator


Description: Logically AND the 8-bit immediate operand (#xx) with the contents of the destination control
register D and store the result in the destination control register. The condition codes are
affected only when the condition code register (CCR) is specified as the destination operand.


Restrictions: The ANDI #xx,MR instruction cannot be used immediately before an ENDDO or RTI
instruction and cannot be one of the last three instructions in a DO loop (at LA-2, LA-1 or LA).


The ANDI #xx,CCR instruction cannot be used immediately before an RTI instruction.


Example:


AND #$FE,CCR ;clear carry bit C in cond. code register


SR Before Execution          SR After Execution
XX31                         XX30
MR:CCR                       MR:CCR


Explanation of Example: Prior to execution, the 8-bit condition code register (CCR) contains the value
$31. The AND #$FE,CCR instruction logically ANDs the immediate 8-bit value $FE with
the contents of the condition code register and stores the result in the condition code
register.
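
The effect of this example on the status register can be written out in Python. The function name `andi_ccr` is hypothetical, and the MR value $03 used in the call is an arbitrary stand-in for the unspecified "XX" in the example; only the CCR byte is modified.

```python
def andi_ccr(sr, imm8):
    """AND an 8-bit immediate with the CCR (low byte of SR); MR is untouched."""
    mr = sr & 0xFF00
    ccr = sr & 0x00FF
    return mr | (ccr & imm8 & 0xFF)

if __name__ == "__main__":
    sr = andi_ccr(0x0331, 0xFE)      # clear carry bit C (bit 0)
    print(f"SR = ${sr:04X}")
```

Because each CCR bit survives only where the immediate has a 1, the mask $FE clears C and leaves the other seven condition codes as they were: $31 becomes $30.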


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


For CCR operand: 
S — Cleared if bit 7 of the immediate operand is cleared 
L — Cleared if bit 6 of the immediate operand is cleared 
E — Cleared if bit 5 of the immediate operand is cleared 
U — Cleared if bit 4 of the immediate operand is cleared 
N — Cleared if bit 3 of the immediate operand is cleared 
Z — Cleared if bit 2 of the immediate operand is cleared 
V — Cleared if bit 1 of the immediate operand is cleared 
C — Cleared if bit 0 of the immediate operand is cleared 


For MR and OMR operands: 
The condition codes are not affected using these operands 


Instruction Format:

AND(I) #xx,D

Opcode:

15        12 11         8 7          4 3          0
0  0  0  1 | 1  E  E  0 | i  i  i  i | i  i  i  i

Instruction Fields: #xx = 8-bit Immediate Short Data (iiiiiiii)

D     EE
MR    01
CCR   11
OMR   10

Timing: 2 oscillator clock cycles
Memory: 1 program word

ASL                    Arithmetic Shift Accumulator Left                    ASL


Assembler Syntax:

ASL D (parallel move)

Operation:

C ← [ D2 | D1 | D0 ] ← 0    (parallel move)


Description: Arithmetically shift the destination operand D one bit to the left and store the result in the
destination accumulator. The MS bit of D prior to instruction execution is shifted into the
carry bit C and a zero is shifted into the LS bit of the destination accumulator D.


Example:
ASL A (R3)- ;multiply A by 2, update R3

Before Execution             After Execution
A5   0123   0123             4A   0246   0246
A2   A1     A0               A2   A1     A0

0300                         0373
SR=MR:CCR                    SR=MR:CCR


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $A5:0123:0123. 
Execution of the ASL A instruction shifts the 40-bit value in the A accumulator one bit to the 
left and stores the result back in the A accumulator. The C bit of CCR (bit 0) is set by the 
operation because bit 39 of A was set prior to the instruction execution. The V bit of CCR 
(bit 1) is also set because bit 39 of A has changed during the instruction execution. The U 
bit of CCR (bit 4) is set because the result is unnormalized, the E bit of CCR (bit 5) is set 
because the signed integer portion of the result is in use, and the L bit of CCR (bit 6) is also 
set because an overflow has occurred. 
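
The condition code values in this explanation can be reproduced with a short behavioral model in Python. The function name `asl` is hypothetical, the model assumes no scaling and no parallel move (so S is carried over unchanged), and V is approximated as "bit 39 changed during the shift", which is the special definition this instruction gives.

```python
MASK40 = (1 << 40) - 1

def asl(acc, ccr=0):
    """One-bit arithmetic shift left of a 40-bit accumulator (no scaling)."""
    acc &= MASK40
    result = (acc << 1) & MASK40
    c = (acc >> 39) & 1                        # old bit 39 goes to carry
    n = (result >> 39) & 1
    v = 1 if c != n else 0                     # bit 39 changed by the shift
    z = 1 if result == 0 else 0
    e = 0 if (result >> 31) in (0, 0x1FF) else 1   # bits 39..31 all equal?
    u = 1 if ((result >> 31) & 1) == ((result >> 30) & 1) else 0
    l = 1 if (v or ccr & 0x40) else 0          # L is latched once set
    new_ccr = ((ccr & 0x80) | (l << 6) | (e << 5) | (u << 4)
               | (n << 3) | (z << 2) | (v << 1) | c)
    return result, new_ccr

if __name__ == "__main__":
    result, ccr = asl(0xA501230123, ccr=0x00)
    print(f"A = ${result:010X}, CCR = ${ccr:02X}")
```

Starting from A = $A5:0123:0123 with a cleared CCR, the model yields $4A:0246:0246 and CCR = $73 (L, E, U, V and C set), in agreement with the SR value $0373 shown in the example.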


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if bit 39 of A or B result is changed due to left shift
C — Set if bit 39 of A or B was set prior to instruction execution


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format:

ASL D (parallel move)

Opcode:

15        12 11         8 7          4 3          0
1  m  R  R | H  H  H  W | 0  0  1  1 | F  0  0  1

Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.

D   F
A   0
B   1

Timing: 2 + mv oscillator clock cycles
Memory: 1 program word

ASL4                  4-bit Arithmetic Shift Accumulator Left                  ASL4


Assembler Syntax:

ASL4 D (no parallel move)

Operation:

C ← (bit 36) [ D2 | D1 | D0 ] ← 0

Description: Arithmetically shift the destination operand D four bits to the left and store the result in the 
destination accumulator. Bit 36 of D (bit 4 of D2) prior to instruction execution is shifted into 
the carry bit C and zeros are shifted into the four LS bits of the destination accumulator D. 


Example:
ASL4 A ;scaled four times to the left

Before Execution             After Execution
B5   0123   0123             50   1230   1230
A2   A1     A0               A2   A1     A0

0300                         0373
SR=MR:CCR                    SR=MR:CCR


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $B5:0123:0123.
Execution of the ASL4 A instruction shifts the 40-bit value in the A accumulator four bits to
the left and stores the result ($50:1230:1230) back in the A accumulator. The C bit of CCR
(bit 0) is set by the operation because bit 36 of A was set prior to the instruction execution.
The V bit of CCR (bit 1) is also set because bit 39 of A has changed during the instruction
execution. The U bit of CCR (bit 4) is set because bits 31 and 30 of the result are equal, the
E bit of CCR (bit 5) is set because the signed integer portion of the result is in use, and the
L bit of CCR (bit 6) is also set because an overflow has occurred.


Warning: The saturation mode is ALWAYS disabled during execution of ASL4, even when the
saturation bit (SA) of the OMR is set.
ASL4 A (or B) can be followed by a MOVE A,A (or B,B) for proper operation when the
saturation mode is turned on. However, the “V” bit of the status register will never be set by
the saturation of the accumulator during the MOVE A,A (or B,B). Only the “L” bit will then
be set. If the “V” bit needs to be tested by the user program, ASL4 has to be replaced by
a repetition of four ASLs.
Refer to Sections 5.3 and 5.8 for more details.
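
The ASL4 example can be traced with the following Python sketch. It is a behavioral illustration with hypothetical names (`asl4`), assuming no scaling and saturation mode off; V follows the special definition for this instruction (bits 35 through 39 not all the same before the shift) and C is taken from bit 36.

```python
MASK40 = (1 << 40) - 1

def asl4(acc, ccr=0):
    """Four-bit arithmetic shift left of a 40-bit accumulator (no scaling)."""
    acc &= MASK40
    result = (acc << 4) & MASK40
    c = (acc >> 36) & 1                        # bit 36 is the last bit shifted out
    v = 0 if (acc >> 35) in (0, 0x1F) else 1   # bits 39..35 differ before shift
    n = (result >> 39) & 1
    z = 1 if result == 0 else 0
    e = 0 if (result >> 31) in (0, 0x1FF) else 1
    u = 1 if ((result >> 31) & 1) == ((result >> 30) & 1) else 0
    l = 1 if (v or ccr & 0x40) else 0          # L is latched once set
    new_ccr = ((ccr & 0x80) | (l << 6) | (e << 5) | (u << 4)
               | (n << 3) | (z << 2) | (v << 1) | c)
    return result, new_ccr

if __name__ == "__main__":
    result, ccr = asl4(0xB501230123)
    print(f"A = ${result:010X}, CCR = ${ccr:02X}")
```

For A = $B5:0123:0123 the sketch returns $50:1230:1230 with CCR = $73, the same values as in the example.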


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


L — Set if overflow has occurred in result
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if bits 35 through 39 of A or B are not the same before the shift
C — Set if bit 36 of A or B was set prior to instruction execution


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format:

ASL4 D

Opcode:

15        12 11         8 7          4 3          0
0  0  0  1 | 0  1  0  1 | 0  0  1  1 | F  0  0  1

Instruction Fields:

D   F
A   0
B   1

Timing: 2 oscillator clock cycles
Memory: 1 program word

ASR                    Arithmetic Shift Accumulator Right                    ASR


Assembler Syntax:

ASR D (parallel move)

Operation:

[ D2 | D1 | D0 ] → C    (parallel move)


Description: Arithmetically shift the destination operand D one bit to the right and store the result in the
destination accumulator. The LS bit of D prior to instruction execution is shifted into the
carry bit C and the MS bit of D is held constant.


Example:
ASR B X:-(R3),R3 ;divide B by 2 (unless B is -1), update R3, load R3

Before Execution             After Execution
A8   A864   A865             D4   5432   5432
B2   B1     B0               B2   B1     B0

0300                         0329
SR=MR:CCR                    SR=MR:CCR


Explanation of Example: Prior to execution, the 40-bit B accumulator contains the value
$A8:A864:A865. Execution of the ASR B instruction shifts the 40-bit value in the B
accumulator one bit to the right and stores the result back in the B accumulator. The C bit of
CCR (bit 0) is set by the operation because bit 0 of B was set prior to the instruction
execution. The N bit of CCR (bit 3) is also set because bit 39 of the result in B is set. The E bit
of CCR (bit 5) is set because the signed integer portion of B is used by the result.
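
The arithmetic right shifts in this appendix (ASR here, and the ASR4 and ASR16 instructions described on the following pages) differ only in the shift count, so one Python sketch can check all three examples. The function name `asr` and its `count` parameter are hypothetical; the model assumes no scaling, keeps V cleared as the condition code lists state, and leaves out S and L because they depend on a parallel move.

```python
MASK40 = (1 << 40) - 1

def asr(acc, count=1):
    """Arithmetic shift right of a 40-bit accumulator by count bits."""
    acc &= MASK40
    c = (acc >> (count - 1)) & 1               # last bit shifted out
    signed = acc - (1 << 40) if acc >> 39 else acc
    result = (signed >> count) & MASK40        # >> on a negative int keeps the sign
    n = (result >> 39) & 1
    z = 1 if result == 0 else 0
    e = 0 if (result >> 31) in (0, 0x1FF) else 1
    u = 1 if ((result >> 31) & 1) == ((result >> 30) & 1) else 0
    ccr = (e << 5) | (u << 4) | (n << 3) | (z << 2) | c   # V is always cleared
    return result, ccr

if __name__ == "__main__":
    for value, count in ((0xA8A864A865, 1), (0xA8A864A86C, 4), (0xA8A864A864, 16)):
        result, ccr = asr(value, count)
        print(f"count={count:2d}: ${value:010X} -> ${result:010X}, CCR=${ccr:02X}")
```

The three calls give $D4:5432:5432 with CCR $29, $FA:8A86:4A86 with CCR $29, and $FF:FFA8:A864 with CCR $19, matching the ASR, ASR4 and ASR16 examples respectively.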


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition (see section A.4)
L — Set if data limiting has occurred during parallel move
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Always cleared
C — Set if bit 0 of A or B was set prior to instruction execution


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format:

ASR D (parallel move)

Opcode:

15        12 11         8 7          4 3          0
1  m  R  R | H  H  H  W | 0  0  1  1 | F  0  0  0

Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.

D   F
A   0
B   1

Timing: 2 + mv oscillator clock cycles
Memory: 1 program word
For More Information On This Product, Go to: www.freescale.com


ASR4                  4-bit Arithmetic Shift Accumulator Right                  ASR4


Assembler Syntax:

ASR4 D (no parallel move)

Operation:

[ D2 | D1 | D0 ] → (bit 3) C
Description: Arithmetically shift the destination operand D four bits to the right and store the result in the
destination accumulator. Bit 3 of D prior to instruction execution is shifted into the carry bit
C and the 4 MS bits of D are set to the MSB of D prior to instruction execution.


Example:
ASR4 B

Before Execution             After Execution
A8   A864   A86C             FA   8A86   4A86
B2   B1     B0               B2   B1     B0

0300                         0329
SR=MR:CCR                    SR=MR:CCR


Explanation of Example: Prior to execution, the 40-bit B accumulator contains the value
$A8:A864:A86C. Execution of the ASR4 B instruction shifts the 40-bit value in the B
accumulator four bits to the right and stores the result back in the B accumulator. The C bit of
CCR (bit 0) is set by the operation because bit 3 of B was set prior to the instruction
execution. The N bit of CCR (bit 3) is also set because bit 39 of the result in B is set. The E bit
of CCR (bit 5) is set because the signed integer portion of B is used by the result.
Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Always cleared
C — Set if bit 3 of A or B was set prior to instruction execution


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format: 
ASR4 D 
Opcode: 


15 12 11 8 7 4 3 0 


0 00 1/0 1031/0 0 1 1;/;F 0 0 0 


Instruction Fields: 


D F 
A 0 
B 1 
Timing: 2 oscillator clock cycles 
Memory: 1 program word
ASR16 16-bit Arithmetic Shift Accumulator Right ASR16


Assembler Syntax: 


ASR16 D (no parallel move) 
Operation: 


[Shift diagram: D2 D1 D0 shifted right 16 bits, with bit 15 shifted into C]


Description: Arithmetically shift the destination operand D 16 bits to the right and store the result in the
destination accumulator. The MS bit of D0 (bit 15 of D), prior to instruction execution, is
shifted into the carry bit C, and the MS bits of D are sign extended.


Example: 
ASR16 A
Before Execution              After Execution
A8 A864 A864                  FF FFA8 A864
A2 A1   A0                    A2 A1   A0
0000                          0019
SR=MR:CCR                     SR=MR:CCR


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value
$A8:A864:A864. Execution of the ASR16 A instruction shifts the 40-bit value in the A accumulator
16 bits to the right and stores the result back in the A accumulator. The C bit of CCR (bit 0) is
set by the operation because bit 15 of A was set prior to the instruction execution. The N bit of
CCR (bit 3) is also set because bit 39 of the result in A is set. The U bit of CCR (bit 4) is set
because bit 31 and bit 30 of the result are equal.
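
As a cross-check of the example, the following Python sketch models the 16-bit shift together with the C, N, and U bits. It is a sketch under stated assumptions: the name `asr16` is hypothetical, the accumulator is an unsigned 40-bit integer, and the U bit uses the bit 31 / bit 30 comparison quoted in the explanation, which may differ in other scaling modes.

```python
MASK40 = (1 << 40) - 1

def asr16(acc):
    """Model ASR16 on a 40-bit accumulator; returns (result, C, N, U)."""
    acc &= MASK40
    carry = (acc >> 15) & 1                  # MS bit of D0 goes to C
    result = acc >> 16
    if (acc >> 39) & 1:
        result |= 0xFFFF << 24               # sign extend the 16 vacated MS bits
    n = (result >> 39) & 1                   # N follows bit 39 of the result
    # U as used in the example: set when bits 31 and 30 are equal (assumed: no scaling)
    u = int(((result >> 31) & 1) == ((result >> 30) & 1))
    return result, carry, n, u

result, c, n, u = asr16(0xA8A864A864)
print(f"{result:010X} C={c} N={n} U={u}")    # FFFFA8A864 C=1 N=1 U=1
```

With U (bit 4), N (bit 3), and C (bit 0) set, the CCR byte is $19, which agrees with the SR value shown after execution.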


Condition Codes Affected: 


|<----------- MR ------------->|<----------- CCR ------------>|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

E - Set if the signed integer portion of A or B result is in use
U - Set according to the standard definition of the U bit
N - Set if bit 39 of A or B result is set
Z - Set if A or B result equals zero
V - Always cleared
C - Set if bit 15 of A or B was set prior to instruction execution


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format:
ASR16 D (no parallel move)
Opcode:
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 1 | 0 1 1 1 | F 0 0 0


Instruction Fields: 


D F 
A 0 
B 1 
Timing: 2 oscillator clock cycles 
Memory: 1 program word
BFCHG Test Bit Field and Change BFCHG


Operation:                                                      Assembler Syntax:
~(<bit field> of destination) → (<bit field> of destination)    BFCHG #iiii,X:<aa>
~(<bit field> of destination) → (<bit field> of destination)    BFCHG #iiii,X:<pp>
~(<bit field> of destination) → (<bit field> of destination)    BFCHG #iiii,X:<ea>
~(<bit field> of destination) → (<bit field> of destination)    BFCHG #iiii,D


Description: Test up to 8 bits grouped within a byte of the destination operand, complement them, and
store the result in the destination memory location. The bits to be tested are selected by an
immediate 16-bit hexadecimal number in which every bit set is to be tested and changed.
The bits to be tested must be located in the same byte (low byte for bits 0-7; middle byte
for bits 4-11; high byte for bits 8-15). This instruction performs a read-modify-write operation
on the destination memory location or register and requires two destination accesses.
This instruction is very useful for performing I/O bit manipulation.
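
The same-byte rule in the description can be checked mechanically before a mask is used. The following Python sketch is not part of the instruction set; the name `byte_windows` is hypothetical, and the three bit ranges are the ones listed in the description.

```python
# The three 8-bit windows named in the description (inclusive bit ranges).
WINDOWS = {"lower": (0, 7), "middle": (4, 11), "upper": (8, 15)}

def byte_windows(mask):
    """Return the names of the windows that contain every set bit of a 16-bit mask."""
    mask &= 0xFFFF
    if mask == 0:
        return []                              # a zero mask selects nothing
    fits = []
    for name, (low, high) in WINDOWS.items():
        window = ((1 << (high - low + 1)) - 1) << low
        if mask & ~window == 0:                # no selected bit falls outside the window
            fits.append(name)
    return fits

print(byte_windows(0x0310))   # ['middle']
print(byte_windows(0xF400))   # ['upper']
print(byte_windows(0x8001))   # []
```

A mask such as $8001 spans more than one byte window and therefore cannot be encoded as a single bit field instruction.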


Example: 
BFCHG #$0310,X:<<$FFE2 ;test and change bits 4,8,9 in I/O Port B Data Register 


Before Execution After Execution 


X:$FFE2 0010 X:$FFE2 0300 
0000 0000 


SR=MR:CCR SR=MR:CCR 


Explanation of Example: Prior to execution, the 16-bit X memory location X:$FFE2 (I/O Port B Data
Register) contains the value $0010. Execution of the instruction tests the state of bits
4,8,9 in X:$FFE2, does not set the carry bit C in CCR because not all of these bits were set,
and then complements the bits.
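
The test-then-complement behavior can be sketched in a few lines of Python. This is a behavioral model only, assuming a 16-bit operand and a nonzero mask; the function name `bfchg` is hypothetical.

```python
def bfchg(value, mask):
    """Model BFCHG on a 16-bit operand: return (new_value, carry)."""
    value &= 0xFFFF
    mask &= 0xFFFF
    carry = int((value & mask) == mask)   # C is set only if every selected bit was set
    return value ^ mask, carry            # complement the selected bits

new_value, carry = bfchg(0x0010, 0x0310)
print(f"{new_value:04X} C={carry}")       # 0300 C=0
```

The output reproduces the example: $0010 becomes $0300 and C stays clear because only bit 4 of the three selected bits was set.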


Condition Codes Affected: 


|<----------- MR ------------->|<----------- CCR ------------>|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

For destination operand SR:
- Changed if specified in the field
For other destination operands:
L - Set if data limiting occurred during 40-bit source move
C - Set if all the bits specified by the mask are set
Warning: Bit field instructions should always be used with a mask different from zero.
Instruction Format and Opcode:

BFCHG #iiii,X:<aa>
BFCHG #iiii,X:<pp>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 1 P p | p p p p
 B B B 1 | 0 0 1 0 | i i i i | i i i i

P   Destination
0   X:<aa> 5-bit Absolute Short Address (aaaaa)
1   X:<pp> 5-bit I/O Short Address (ppppp)

BFCHG #iiii,X:<ea>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 0 1 - | - - R R
 B B B 1 | 0 0 1 0 | i i i i | i i i i
"-" = don't care

RR   Destination
00   X:(R0)
01   X:(R1)
10   X:(R2)
11   X:(R3)

BFCHG #iiii,DDDDD
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 0 0 D | D D D D
 B B B 1 | 0 0 1 0 | i i i i | i i i i

Register DDDDD   Register DDDDD   Register DDDDD   Register DDDDD
X0       00000   SR       01001   R0       10000   SSH      11000
Y0       00001   OMR      01010   R1       10001   SSL      11001
X1       00010   SP       01011   R2       10010   LA       11010
Y1       00011   A1       01100   R3       10011   LC       01000
A        00100   B1       01101   M0       10100   N0       11100
B        00101   A2       01110   M1       10101   N1       11101
A0       00110   B2       01111   M2       10110   N2       11110
B0       00111                    M3       10111   N3       11111

Instruction Fields for second word:
BBB   Field active
100   upper byte (bit 8-15)
010   middle byte (bit 4-11)
001   lower byte (bit 0-7)


Timing: 4 + mvb oscillator clock cycles 
Memory: 2 program words 
BFCLR Clear Bit Field BFCLR


Operation:                              Assembler Syntax:
0 → (<bit field> of destination)        BFCLR #iiii,X:<aa>
0 → (<bit field> of destination)        BFCLR #iiii,X:<pp>
0 → (<bit field> of destination)        BFCLR #iiii,X:<ea>
0 → (<bit field> of destination)        BFCLR #iiii,D


Description: Clear up to 8 bits grouped within a byte of the destination operand and store the result in
the destination memory location. The bits to be cleared are selected by an immediate 16-bit
hexadecimal number in which every bit set is to be cleared. The bits to be cleared must
be located in the same byte (low byte for bits 0-7; middle byte for bits 4-11; high byte for
bits 8-15). This instruction performs a read-modify-write operation on the destination memory
location or register and requires two destination accesses. This instruction is very useful
for performing I/O bit manipulation.


Example: 
BFCLR #$0310,X:<<$FFE2 ;test and clear bits 4,8,9 in I/O Port B Data Register 


Before Execution After Execution 


X:$FFE2 7F95 X:$FFE2 7C85
0000 0000 


SR=MR:CCR SR=MR:CCR 


Explanation of Example: Prior to execution, the 16-bit X memory location X:$FFE2 (I/O Port B Data
Register) contains the value $7F95. Execution of the instruction tests the state of bits
4,8,9 in X:$FFE2, clears the carry bit C in CCR because not all these bits were set, and then
clears the bits.


Condition Codes Affected: 


|<----------- MR ------------->|<----------- CCR ------------>|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

For destination operand SR:
- Cleared as defined in the field and if specified in the field
For other destination operands:
L - Set if data limiting occurred during 40-bit source move
C - Set if all the bits specified by the mask are set
    Cleared if not all the bits specified by the mask are set
Warning: Bit field instructions should always be used with a mask different from zero. If the mask is
zero, the instruction essentially executes two NOPs.
Instruction Format and Opcode:

BFCLR #iiii,X:<aa>
BFCLR #iiii,X:<pp>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 1 P p | p p p p
 B B B 0 | 0 1 0 0 | i i i i | i i i i

P   Destination
0   X:<aa> 5-bit Absolute Short Address (aaaaa)
1   X:<pp> 5-bit I/O Short Address (ppppp)

BFCLR #iiii,X:<ea>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 0 1 - | - - R R
 B B B 0 | 0 1 0 0 | i i i i | i i i i
"-" = don't care

RR   Destination
00   X:(R0)
01   X:(R1)
10   X:(R2)
11   X:(R3)

BFCLR #iiii,DDDDD
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 0 0 D | D D D D
 B B B 0 | 0 1 0 0 | i i i i | i i i i

Register DDDDD   Register DDDDD   Register DDDDD   Register DDDDD
X0       00000   SR       01001   R0       10000   SSH      11000
Y0       00001   OMR      01010   R1       10001   SSL      11001
X1       00010   SP       01011   R2       10010   LA       11010
Y1       00011   A1       01100   R3       10011   LC       01000
A        00100   B1       01101   M0       10100   N0       11100
B        00101   A2       01110   M1       10101   N1       11101
A0       00110   B2       01111   M2       10110   N2       11110
B0       00111                    M3       10111   N3       11111

Instruction Fields for second word:
BBB   Field active
100   upper byte (bit 8-15)
010   middle byte (bit 4-11)
001   lower byte (bit 0-7)


Timing: 4 + mvb oscillator clock cycles 
Memory: 2 program words 
BFSET Set Bit Field BFSET


Operation:                              Assembler Syntax:
1 → (<bit field> of destination)        BFSET #iiii,X:<aa>
1 → (<bit field> of destination)        BFSET #iiii,X:<pp>
1 → (<bit field> of destination)        BFSET #iiii,X:<ea>
1 → (<bit field> of destination)        BFSET #iiii,D


Description: Set up to 8 bits grouped within a byte of the destination operand and store the result in the
destination memory location. The bits to be set are selected by an immediate 16-bit hexadecimal
number in which every bit set is to be tested and set. The bits to be set must be
located in the same byte (low byte for bits 0-7; middle byte for bits 4-11; high byte for bits
8-15). This instruction performs a read-modify-write operation on the destination memory
location or register and requires two destination accesses. This instruction is very useful for
performing I/O bit manipulation.


Example: 


BFSET #$F400,X:<<$FFE2 ;test and set bits 10,12,13,14,15 in I/O Port B
                       ;Data Register


Before Execution After Execution 


X:$FFE2 8921 X:$FFE2 FD21 
0000 0000 


SR=MR:CCR SR=MR:CCR 


Explanation of Example: Prior to execution, the 16-bit X memory location X:$FFE2 (I/O Port B Data
Register) contains the value $8921. Execution of the instruction tests the state of bits
10,12,13,14,15 in X:$FFE2, does not set the carry bit C in CCR because not all these bits
were set, and then sets the bits.
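
The example can be reproduced with a short Python model of the test-and-set behavior. This is a sketch, assuming a 16-bit operand and a nonzero mask; the function name `bfset` is hypothetical.

```python
def bfset(value, mask):
    """Model BFSET on a 16-bit operand: return (new_value, carry)."""
    value &= 0xFFFF
    mask &= 0xFFFF
    carry = int((value & mask) == mask)   # C reflects the state before the bits are set
    return value | mask, carry            # set the selected bits

new_value, carry = bfset(0x8921, 0xF400)
print(f"{new_value:04X} C={carry}")       # FD21 C=0
```

Only bit 15 of the five selected bits was set in $8921, so C remains clear while the stored value becomes $FD21.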


Condition Codes Affected: 


|<----------- MR ------------->|<----------- CCR ------------>|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

For destination operand SR:
- Set as defined in the field and if specified in the field
For other destination operands:
L - Set if data limiting occurred during 40-bit source move
C - Set if all the bits specified by the mask are set
Warning: Bit field instructions should always be used with a mask different from zero. If the mask is
zero, the instruction essentially executes two NOPs.
Instruction Format and Opcode:

BFSET #iiii,X:<aa>
BFSET #iiii,X:<pp>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 1 P p | p p p p
 B B B 1 | 1 0 0 0 | i i i i | i i i i

P   Destination
0   X:<aa> 5-bit Absolute Short Address (aaaaa)
1   X:<pp> 5-bit I/O Short Address (ppppp)

BFSET #iiii,X:<ea>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 0 1 - | - - R R
 B B B 1 | 1 0 0 0 | i i i i | i i i i
"-" = don't care

RR   Destination
00   X:(R0)
01   X:(R1)
10   X:(R2)
11   X:(R3)

BFSET #iiii,DDDDD
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 1 0 0 D | D D D D
 B B B 1 | 1 0 0 0 | i i i i | i i i i

Register DDDDD   Register DDDDD   Register DDDDD   Register DDDDD
X0       00000   SR       01001   R0       10000   SSH      11000
Y0       00001   OMR      01010   R1       10001   SSL      11001
X1       00010   SP       01011   R2       10010   LA       11010
Y1       00011   A1       01100   R3       10011   LC       01000
A        00100   B1       01101   M0       10100   N0       11100
B        00101   A2       01110   M1       10101   N1       11101
A0       00110   B2       01111   M2       10110   N2       11110
B0       00111                    M3       10111   N3       11111

Instruction Fields for second word:
BBB   Field active
100   upper byte (bit 8-15)
010   middle byte (bit 4-11)
001   lower byte (bit 0-7)


Timing: 4 + mvb oscillator clock cycles 
Memory: 2 program words 
BFTSTH Test Bit Field High BFTSTH


Operation:                              Assembler Syntax:
<bit field> of destination              BFTSTH #iiii,X:<aa>
<bit field> of destination              BFTSTH #iiii,X:<pp>
<bit field> of destination              BFTSTH #iiii,X:<ea>
<bit field> of destination              BFTSTH #iiii,D


Description: Test high up to 8 bits grouped within a byte of the destination operand. The bits to be tested
are selected by an immediate 16-bit hexadecimal number in which every bit set is to be tested.
The bits to be tested must be located in the same byte (low byte for bits 0-7; middle
byte for bits 4-11; high byte for bits 8-15). If all the bits tested were high, the C condition bit
is set. This instruction is very useful for performing I/O flag polling.


Example: 
BFTSTH #$0310,X:<<$FFE2 ;test high bits 4,8,9 in I/O Port B Data Register


Before Execution              After Execution


X:$FFE2 0FF0                  X:$FFE2 0FF0
0000                          0001


SR=MR:CCR                     SR=MR:CCR


Explanation of Example: Prior to execution, the 16-bit X memory location X:$FFE2 (I/O Port B Data
Register) contains the value $0FF0. Execution of the instruction tests the state of bits 4,8,9
in X:$FFE2 and sets the carry bit C in CCR because all these bits were set.


Condition Codes Affected: 


|<----------- MR ------------->|<----------- CCR ------------>|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

L - Set if data limiting occurred during 40-bit source move
C - Set if all the bits specified by the mask are set
WARNING: Bit field instructions should always be used with a mask different from zero. If the mask is
zero, the instruction essentially executes two NOPs.
Instruction Format and Opcode:

BFTSTH #iiii,X:<aa>
BFTSTH #iiii,X:<pp>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 0 1 P p | p p p p
 B B B 1 | 0 0 0 0 | i i i i | i i i i

P   Destination
0   X:<aa> 5-bit Absolute Short Address (aaaaa)
1   X:<pp> 5-bit I/O Short Address (ppppp)

BFTSTH #iiii,X:<ea>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 0 0 1 - | - - R R
 B B B 1 | 0 0 0 0 | i i i i | i i i i
"-" = don't care

RR   Destination
00   X:(R0)
01   X:(R1)
10   X:(R2)
11   X:(R3)

BFTSTH #iiii,DDDDD
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 0 0 0 D | D D D D
 B B B 1 | 0 0 0 0 | i i i i | i i i i

Register DDDDD   Register DDDDD   Register DDDDD   Register DDDDD
X0       00000   SR       01001   R0       10000   SSH      11000
Y0       00001   OMR      01010   R1       10001   SSL      11001
X1       00010   SP       01011   R2       10010   LA       11010
Y1       00011   A1       01100   R3       10011   LC       01000
A        00100   B1       01101   M0       10100   N0       11100
B        00101   A2       01110   M1       10101   N1       11101
A0       00110   B2       01111   M2       10110   N2       11110
B0       00111                    M3       10111   N3       11111

Instruction Fields for second word:
BBB   Field active
100   upper byte (bit 8-15)
010   middle byte (bit 4-11)
001   lower byte (bit 0-7)


Timing: 4 + mvb oscillator clock cycles 
Memory: 2 program words 
BFTSTL Test Bit Field Low BFTSTL


Operation:                              Assembler Syntax:
<bit field> of destination              BFTSTL #iiii,X:<aa>
<bit field> of destination              BFTSTL #iiii,X:<pp>
<bit field> of destination              BFTSTL #iiii,X:<ea>
<bit field> of destination              BFTSTL #iiii,D


Description: Test low up to 8 bits grouped within a byte of the destination operand. The bits to be tested
are selected by an immediate 16-bit hexadecimal number in which every bit set is to be tested.
The bits to be tested must be located in the same byte (low byte for bits 0-7; middle
byte for bits 4-11; high byte for bits 8-15). If all the bits tested were low, the C condition bit
is set. This instruction is very useful for performing I/O flag polling.


Example: 
BFTSTL #$0310,X:<<$FFE2 ;test low bits 4,8,9 in I/O Port B Data Register


Before Execution After Execution 


X:$FFE2 18EC X:$FFE2 18EC 
0000 0001 


SR=MR:CCR SR=MR:CCR 


Explanation of Example: Prior to execution, the 16-bit X memory location X:$FFE2 (I/O Port B Data 
Register) contains the value $18EC. Execution of the instruction tests the state of bits 4,8,9 
in X:$FFE2 and sets the carry bit C in CCR because all these bits were cleared. 
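
BFTSTH and BFTSTL differ only in the polarity of the test, and neither modifies the operand. The following Python sketch models the resulting C bit for both; the function names are hypothetical and a nonzero 16-bit mask is assumed.

```python
def bftsth(value, mask):
    """Return the C bit after BFTSTH: 1 if every selected bit is set."""
    mask &= 0xFFFF
    return int((value & mask) == mask)

def bftstl(value, mask):
    """Return the C bit after BFTSTL: 1 if every selected bit is clear."""
    mask &= 0xFFFF
    return int((value & mask) == 0)

print(bftsth(0x0FF0, 0x0310))   # 1
print(bftstl(0x18EC, 0x0310))   # 1
```

Both results agree with the examples: bits 4, 8, and 9 are all set in $0FF0 and all clear in $18EC.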


Condition Codes Affected: 


|<----------- MR ------------->|<----------- CCR ------------>|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

L - Set if data limiting occurred during 40-bit source move
C - Set if all the bits specified by the mask are cleared
WARNING: Bit field instructions should always be used with a mask different from zero. If the mask is
zero, the instruction essentially executes two NOPs.
Instruction Format and Opcode:

BFTSTL #iiii,X:<aa>
BFTSTL #iiii,X:<pp>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 0 1 P p | p p p p
 B B B 0 | 0 0 0 0 | i i i i | i i i i

P   Destination
0   X:<aa> 5-bit Absolute Short Address (aaaaa)
1   X:<pp> 5-bit I/O Short Address (ppppp)

BFTSTL #iiii,X:<ea>
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 0 0 1 - | - - R R
 B B B 0 | 0 0 0 0 | i i i i | i i i i
"-" = don't care

RR   Destination
00   X:(R0)
01   X:(R1)
10   X:(R2)
11   X:(R3)

BFTSTL #iiii,DDDDD
15     12 11      8 7       4 3       0
 0 0 0 1 | 0 1 0 0 | 0 0 0 D | D D D D
 B B B 0 | 0 0 0 0 | i i i i | i i i i

Register DDDDD   Register DDDDD   Register DDDDD   Register DDDDD
X0       00000   SR       01001   R0       10000   SSH      11000
Y0       00001   OMR      01010   R1       10001   SSL      11001
X1       00010   SP       01011   R2       10010   LA       11010
Y1       00011   A1       01100   R3       10011   LC       01000
A        00100   B1       01101   M0       10100   N0       11100
B        00101   A2       01110   M1       10101   N1       11101
A0       00110   B2       01111   M2       10110   N2       11110
B0       00111                    M3       10111   N3       11111

Instruction Fields for second word:
BBB   Field active
100   upper byte (bit 8-15)
010   middle byte (bit 4-11)
001   lower byte (bit 0-7)


Timing: 4 + mvb oscillator clock cycles 
Memory: 2 program words 
Bcc Branch Conditionally Bcc


Operation:                              Assembler Syntax:
If cc, then PC+label → PC               Bcc xxxx
else PC+1 → PC                          Bcc ee
If cc, then PC+Rn → PC                  Bcc Rn
else PC+1 → PC
Description: If the specified condition is true, program execution continues at location PC+displacement.
The PC contains the address of the next instruction. If the specified condition is false,
the program counter (PC) is incremented and program execution continues sequentially.
Short displacement (6-bit signed value), long displacement (16-bit signed value) and address
register PC relative addressing modes may be used. The 6-bit data is sign extended
to form the effective address.


The term “cc” may specify the following conditions: 


“cc” Mnemonic                              Condition

CC (HS) - carry clear (higher or same)     C=0
CS (LO) - carry set (lower)                C=1
EC      - extension clear                  E=0
EQ      - equal                            Z=1
ES      - extension set                    E=1
GE      - greater than or equal            N ⊕ V=0
GT      - greater than                     Z+(N ⊕ V)=0
LC      - limit clear                      L=0
LE      - less than or equal               Z+(N ⊕ V)=1
LS      - limit set                        L=1
LT      - less than                        N ⊕ V=1
MI      - minus                            N=1
NE      - not equal                        Z=0
NR      - normalized                       Z+(Ū·Ē)=1
PL      - plus                             N=0
NN      - not normalized                   Z+(Ū·Ē)=0

where: Ū denotes the logical complement of U,
       + denotes the logical OR operator,
       · denotes the logical AND operator,
       ⊕ denotes the logical Exclusive OR operator
Restrictions: — A Bcc instruction used within a DO loop cannot begin at the address LA within 
that DO loop. 
— A Bcc instruction cannot be repeated using the REP instruction. 
— Not allowed between addresses P:$0 and P:$40. 
Example: 
BNN R2 ;jump to P:(PC+R2) if not normalized 


Explanation of Example: In this example, program execution is transferred to the address P:(PC+R2) if 
the result is not normalized. If the specified condition is not true, no jump is taken and the 
program counter is incremented by one. 
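
To make the condition tests concrete, the following Python sketch evaluates a "cc" mnemonic against a CCR byte. It is a sketch under assumptions: the flag expressions are the conventional ones for these mnemonics (for example GE as N xor V = 0, and NR as Z + (not U and not E) = 1), and the names `ccr_flags`, `CONDITIONS`, and `condition_true` are hypothetical.

```python
def ccr_flags(ccr):
    """Split the 8-bit CCR into named flags; bit 0 is C and bit 7 is S."""
    return {name: (ccr >> bit) & 1 for bit, name in enumerate("CVZNUELS")}

# Assumed flag expressions for each condition mnemonic.
CONDITIONS = {
    "CC": lambda f: f["C"] == 0,
    "CS": lambda f: f["C"] == 1,
    "EC": lambda f: f["E"] == 0,
    "EQ": lambda f: f["Z"] == 1,
    "ES": lambda f: f["E"] == 1,
    "GE": lambda f: (f["N"] ^ f["V"]) == 0,
    "GT": lambda f: (f["Z"] | (f["N"] ^ f["V"])) == 0,
    "LC": lambda f: f["L"] == 0,
    "LE": lambda f: (f["Z"] | (f["N"] ^ f["V"])) == 1,
    "LS": lambda f: f["L"] == 1,
    "LT": lambda f: (f["N"] ^ f["V"]) == 1,
    "MI": lambda f: f["N"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "NR": lambda f: (f["Z"] | ((1 - f["U"]) & (1 - f["E"]))) == 1,
    "PL": lambda f: f["N"] == 0,
    "NN": lambda f: (f["Z"] | ((1 - f["U"]) & (1 - f["E"]))) == 0,
}

def condition_true(cc, ccr):
    """Return True if condition mnemonic cc holds for the given CCR byte."""
    return CONDITIONS[cc](ccr_flags(ccr))

print(condition_true("NN", 0x29))   # True
```

With CCR = $29 (E, N, and C set), the not-normalized test holds, so a BNN instruction would take the branch under these assumptions.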
Condition Codes Affected:
The condition codes are not affected by this instruction.

Instruction Format and Opcode: 


Bcc xxxx
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 1 1 1 | - - 1 1 | c c c c
 x x x x | x x x x | x x x x | x x x x
"-" = don't care


Instruction Fields: xxxx = 16-bit signed relative branch address 


Timing: 4 + jx oscillator clock cycles
Memory: 2 program words

Instruction Format and Opcode:
Bcc ee
15     12 11      8 7       4 3       0


Instruction Fields: ee = 6-bit signed relative short branch address


Timing: 4 + jx oscillator clock cycles
Memory: 1 program word
Instruction Format and Opcode: 
Bcc Rn
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 1 1 1 | R R 1 0 | c c c c

RR   Rn
00   R0
01   R1
10   R2
11   R3

Timing: 4 + jx oscillator clock cycles
Memory: 1 program word


Instruction Fields:
cc = 4-bit condition code = cccc

Mnemonic   c c c c     Mnemonic   c c c c
CC(HS)     0 0 0 0     CS(LO)     1 0 0 0
GE         0 0 0 1     LT         1 0 0 1
NE         0 0 1 0     EQ         1 0 1 0
PL         0 0 1 1     MI         1 0 1 1
NN         0 1 0 0     NR         1 1 0 0
EC         0 1 0 1     ES         1 1 0 1
LC         0 1 1 0     LS         1 1 1 0
GT         0 1 1 1     LE         1 1 1 1
BRA Branch BRA


Operation:                              Assembler Syntax:
PC+label → PC                           BRA xxxx
                                        BRA aa
PC+Rn → PC                              BRA Rn


Description: Branch to the location in program memory at location PC+displacement. The PC contains
the address of the next instruction. Short displacement (8-bit signed value), long displacement
(16-bit signed value) and address register PC relative addressing modes may be
used. The 8-bit data is sign extended to form the effective address.
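
The effective address calculation can be sketched as sign extension followed by an addition to the address of the next instruction. The Python below is illustrative only; the helper names are hypothetical, and the wrap to a 16-bit program address is an assumption.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def branch_target(next_pc, displacement, bits):
    """PC+displacement, where next_pc is the address of the next instruction."""
    return (next_pc + sign_extend(displacement, bits)) & 0xFFFF  # assumed 16-bit wrap

print(hex(branch_target(0x0100, 0xFE, 8)))   # 0xfe  (displacement of -2)
print(hex(branch_target(0x0100, 0x05, 8)))   # 0x105 (displacement of +5)
```

The same helper covers the 6-bit short displacement of Bcc by passing `bits=6`.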


Restrictions: - A BRA instruction used within a DO loop cannot begin at the address LA within that DO loop.
- A BRA instruction cannot be repeated using the REP instruction.
- Not allowed between addresses P:$0 and P:$40.
Example: 
BRA R2 ;jump to P:(PC+R2) 


Explanation of Example:
In this example, program execution is transferred to the address P:(PC+R2).


Condition Codes Affected: 


The condition codes are not affected by this instruction. 
Instruction Format and Opcode:

BRA xxxx
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 0 0 1 | 0 0 1 1 | 1 1 - -
 x x x x | x x x x | x x x x | x x x x
"-" = don't care

Instruction Fields: xxxx = 16-bit signed relative branch address

Timing: 4 + jx oscillator clock cycles
Memory: 2 program words

BRA aa
15     12 11      8 7       4 3       0
 0 0 0 0 | 1 0 1 1 | a a a a | a a a a

Instruction Fields: aa = 8-bit signed relative short branch address

Timing: 4 + jx oscillator clock cycles
Memory: 1 program word

BRA Rn
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 0 0 1 | 0 0 1 0 | 1 1 R R

RR   Rn
00   R0
01   R1
10   R2
11   R3

Timing: 4 + jx oscillator clock cycles
Memory: 1 program word
BRKcc Exit Current DO Loop Conditionally BRKcc


Operation:                                               Assembler Syntax:
If cc, then  LA+1 → PC; SSL(LF,FV) → SR; SP-1 → SP;      BRKcc
             SSH → LA; SSL → LC; SP-1 → SP
else         PC+1 → PC


Description: Conditionally exit the current hardware DO loop before the current loop counter (LC) equals
one. It also terminates the DO FOREVER loop. If the value of the current DO loop counter
(LC) is needed, it must be read before the execution of the BRKcc instruction. Initially, the
PC is updated from the LA, the loop flag (LF) and the ForeVer flag (FV) are restored and
the remaining portion of the status register (SR) is purged from the system stack. The loop
address (LA) and the loop counter (LC) registers are then restored from the system stack.


The term “cc” may specify the following conditions:


“cc” Mnemonic                              Condition

CC (HS) - carry clear (higher or same)     C=0
CS (LO) - carry set (lower)                C=1
EC      - extension clear                  E=0
EQ      - equal                            Z=1
ES      - extension set                    E=1
GE      - greater than or equal            N ⊕ V=0
GT      - greater than                     Z+(N ⊕ V)=0
LC      - limit clear                      L=0
LE      - less than or equal               Z+(N ⊕ V)=1
LS      - limit set                        L=1
LT      - less than                        N ⊕ V=1
MI      - minus                            N=1
NE      - not equal                        Z=0
NR      - normalized                       Z+(Ū·Ē)=1
PL      - plus                             N=0
NN      - not normalized                   Z+(Ū·Ē)=0

where: Ū denotes the logical complement of U,
       + denotes the logical OR operator,
       · denotes the logical AND operator,
       ⊕ denotes the logical Exclusive OR operator


Restrictions: Due to pipelining and the fact that the BRKcc instruction accesses the program controller
registers, the BRKcc instruction must not be immediately preceded by any of the following instructions:
MOVEC to LA, LC, SR, SSH, SSL or SP
MOVEC from SSH
ORI MR
ANDI MR
Also, the BRKcc instruction cannot be the next to last instruction in a DO loop (at LA-1). It cannot be the
only instruction of a DO loop.

Example: 
        DO      Y0,END_LP       ;exec. loop ending at END_LP (Y0) times
        MOVEC   LC,A            ;get current value of loop counter (LC)
        CMP     Y1,A            ;compare loop counter with value in Y1
        BRKNE                   ;go to first instruction after DO loop if LC not equal to Y1
                                ;(last instruction word in DO loop)
END_LP  MOVE    #$123456,X1     ;(first instruction AFTER DO loop)


Explanation of Example: This example illustrates the use of the BRKcc instruction to terminate the current
DO loop. The value of the loop counter (LC) is compared with the value in the Y1 register
to determine if execution of the DO loop should continue. Note that the BRKcc instruction
updates certain program controller registers and automatically jumps past the end of
the DO loop. Thus, no JMP/BRA instruction needs to be included after the BRKcc to transfer
program control to the first instruction past the end of the DO loop.


Condition Codes Affected: 
The condition codes are not affected by this instruction. 
Instruction Format: 


BRKcc 
Opcode: 


15 12 11 8 7 4 3 0 


0 0 0 0/0 0 0 1 


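Instruction Fields:
cc = 4-bit condition code = cccc

Mnemonic   c c c c     Mnemonic   c c c c
CC(HS)     0 0 0 0     CS(LO)     1 0 0 0
GE         0 0 0 1     LT         1 0 0 1
NE         0 0 1 0     EQ         1 0 1 0
PL         0 0 1 1     MI         1 0 1 1
NN         0 1 0 0     NR         1 1 0 0
EC         0 1 0 1     ES         1 1 0 1
LC         0 1 1 0     LS         1 1 1 0
GT         0 1 1 1     LE         1 1 1 1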
Timing: 2 oscillator clock cycles when cc not true; 8 oscillator clock cycles when cc true 
Memory: 1 program word 
BScc Branch to Subroutine Conditionally BScc


Operation:                              Assembler Syntax:
If cc, then  SP+1 → SP                  BScc xxxx
             PC → SSH
             SR → SSL
             PC+xxxx → PC
else         PC+1 → PC
If cc, then  SP+1 → SP                  BScc Rn
             PC → SSH
             SR → SSL
             PC+Rn → PC
else         PC+1 → PC


Description: If the specified condition is true, program execution continues at location PC+displacement.
The PC contains the address of the next instruction. If the specified condition is false,
the program counter (PC) is incremented and program execution continues sequentially.
Long displacement (16-bit signed value) and address register PC relative addressing
modes may be used.


The term “cc” may specify the following conditions: 


“cc” Mnemonic                              Condition

CC (HS) - carry clear (higher or same)     C=0
CS (LO) - carry set (lower)                C=1
EC      - extension clear                  E=0
EQ      - equal                            Z=1
ES      - extension set                    E=1
GE      - greater than or equal            N ⊕ V=0
GT      - greater than                     Z+(N ⊕ V)=0
LC      - limit clear                      L=0
LE      - less than or equal               Z+(N ⊕ V)=1
LS      - limit set                        L=1
LT      - less than                        N ⊕ V=1
MI      - minus                            N=1
NE      - not equal                        Z=0
NR      - normalized                       Z+(Ū·Ē)=1
PL      - plus                             N=0
NN      - not normalized                   Z+(Ū·Ē)=0

where: Ū denotes the logical complement of U,
       + denotes the logical OR operator,
       · denotes the logical AND operator,
       ⊕ denotes the logical Exclusive OR operator

Restrictions: - A BScc instruction used within a DO loop cannot begin at the address LA within that DO loop.
- A BScc instruction used within a DO loop cannot specify the loop address LA as its target.
- A BScc instruction cannot be repeated using the REP instruction.
- Not allowed between addresses P:$0 and P:$40.


Example:


BSLS R2 ;jump to subroutine at P:(PC+R2) if limit set 


Explanation of Example: In this example, program execution is transferred to the subroutine at address 
P:(PC+R2) if the limit bit is set. If the specified condition is not true, no jump is taken and 
the program counter is incremented by one. 
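
The Operation sequence for a taken branch to subroutine (SP+1 → SP, PC → SSH, SR → SSL, PC+displacement → PC) can be traced with a small state model. The Python below is a sketch only: the function name, the dictionary layout, and the 16-bit address wrap are assumptions made for illustration, and the stack depth limit of the real hardware is not modeled.

```python
def branch_to_subroutine(state, displacement, taken=True):
    """Model BScc/BSR on a simplified machine state.

    state holds 'pc' (address of the next instruction), 'sr', 'sp', and
    'stack', a dict mapping a stack pointer value to an (SSH, SSL) pair.
    """
    if not taken:
        return state                                 # fall through to the next instruction
    sp = state["sp"] + 1                             # SP+1 -> SP
    stack = dict(state["stack"])
    stack[sp] = (state["pc"], state["sr"])           # PC -> SSH, SR -> SSL
    pc = (state["pc"] + displacement) & 0xFFFF       # PC+displacement -> PC (assumed wrap)
    return {"pc": pc, "sr": state["sr"], "sp": sp, "stack": stack}

before = {"pc": 0x0200, "sr": 0x0340, "sp": 0, "stack": {}}
after = branch_to_subroutine(before, 0x0010)
print(hex(after["pc"]), after["sp"], after["stack"][1])   # 0x210 1 (512, 832)
```

The saved pair is the return address and status register, which is what a later return from subroutine would need to restore.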


Condition Codes Affected: 
The condition codes are not affected by this instruction. 


Instruction Format and Opcode: 


BScc xxxx
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 1 1 1 | - - 0 1 | c c c c
 x x x x | x x x x | x x x x | x x x x
"-" = don't care

Instruction Fields: xxxx = 16-bit signed relative branch address

Timing: 4 + jx oscillator clock cycles
Memory: 2 program words

Instruction Format and Opcode:
BScc Rn
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 1 1 1 | R R 0 0 | c c c c

RR   Rn
00   R0
01   R1
10   R2
11   R3

Timing: 4 + jx oscillator clock cycles
Memory: 1 program word


Instruction Fields:
cc = 4-bit condition code = cccc

Mnemonic   c c c c     Mnemonic   c c c c
CC(HS)     0 0 0 0     CS(LO)     1 0 0 0
GE         0 0 0 1     LT         1 0 0 1
NE         0 0 1 0     EQ         1 0 1 0
PL         0 0 1 1     MI         1 0 1 1
NN         0 1 0 0     NR         1 1 0 0
EC         0 1 0 1     ES         1 1 0 1
LC         0 1 1 0     LS         1 1 1 0
GT         0 1 1 1     LE         1 1 1 1
BSR Branch to Subroutine BSR


Operation:                              Assembler Syntax:
SP+1 → SP                               BSR xxxx
PC → SSH
SR → SSL
PC+xxxx → PC
SP+1 → SP                               BSR Rn
PC → SSH
SR → SSL
PC+Rn → PC


Description: Branch to subroutine in program memory at location PC+displacement. The PC contains
the address of the next instruction. Long displacement (16-bit signed value) and address
register PC relative addressing modes may be used.


Restrictions: - A BSR instruction used within a DO loop cannot begin at the address LA within that DO loop.
- A BSR instruction used within a DO loop cannot specify the loop address LA as its target.
- A BSR instruction cannot be repeated using the REP instruction.
- Not allowed between addresses P:$0 and P:$40.


Example:
BSR R2 ;jump to P:(PC+R2)

Explanation of Example:
In this example, program execution is transferred to the subroutine at address P:(PC+R2).


Condition Codes Affected: 


The condition codes are not affected by this instruction. 
Instruction Format and Opcode:

BSR xxxx
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 0 0 1 | 0 0 1 1 | 1 0 - -
 x x x x | x x x x | x x x x | x x x x
"-" = don't care

Instruction Fields: xxxx = 16-bit signed relative branch address

Timing: 4 + jx oscillator clock cycles
Memory: 2 program words

Instruction Format and Opcode:

BSR Rn
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 0 0 1 | 0 0 1 0 | 1 0 R R

RR   Rn
00   R0
01   R1
10   R2
11   R3

Timing: 4 + jx oscillator clock cycles
Memory: 1 program word
CHKAAU Check Address ALU Result CHKAAU


Operation: Assembler Syntax: 

Affects V, Z and N bit of CCR according to last Address ALU result CHKAAU (no parallel move) 

Description: Update the V, Z, and N flags in the CCR according to the result of the address calculation.
Only alterable addressing modes will give meaningful flag updates. When the last address
ALU operation was performed on a double read, the update of the CCR is done according
to the result on the first address ALU register.


Example: 
CHKAAU 
Explanation of Example: see above description. 


Condition Codes Affected: 


|<----------- MR ------------->|<----------- CCR ------------>|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


N — Set if bit 15 (MSB) of the result of the address calculation with linear or modulo 
modifier is set. Cleared otherwise. 


Z — Set if result of the address calculation equals zero. Cleared otherwise. 


V - Set if overflow occurred out of the MSB during address calculation with linear modifier.
Set if wraparound occurred during address calculation with modulo modifier.
Cleared otherwise.


Notes: 1. When CHKAAU is used after a double parallel memory read, the first memory read 
(i.e., the read not addressed by R3) will affect the flags. 


2. When CHKAAU is used after an LEA, the condition codes will not be affected. 
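
The N and Z definitions above depend only on the 16-bit address result, so they can be sketched directly. In the Python below the function name is hypothetical, and the overflow or wraparound indication is passed in as a parameter because detecting it requires the modifier details, which are not modeled here.

```python
def chkaau_flags(address_result, overflow_or_wrap=False):
    """Return the N, Z, and V flags for a 16-bit address ALU result."""
    result = address_result & 0xFFFF
    return {
        "N": (result >> 15) & 1,          # bit 15 (MSB) of the address result
        "Z": int(result == 0),            # result equals zero
        "V": int(overflow_or_wrap),       # supplied by the caller in this sketch
    }

print(chkaau_flags(0x8000))   # {'N': 1, 'Z': 0, 'V': 0}
print(chkaau_flags(0x0000))   # {'N': 0, 'Z': 1, 'V': 0}
```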
Instruction Format:
CHKAAU
Opcode:
15     12 11      8 7       4 3       0
 0 0 0 0 | 0 0 0 0 | 0 0 0 0 | 0 1 0 0
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
CLR Clear Accumulator CLR


Operation:                              Assembler Syntax:
0 → D (parallel move)                   CLR D (parallel move)


Description: Clear the destination accumulator. This is a 40-bit clear instruction. 


Example: 
CLR A A,X0 ;save A into X0 before clearing it

            Before Execution      After Execution
A2:A1:A0    12 3456 789A          00 0000 0000


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $12:3456:789A. 
Execution of the CLR A instruction clears the 40-bit A accumulator to zero. 


Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition (see section A.4) 
L  — Set if data limiting has occurred during parallel move 
E — Always cleared 
U — Always set 
N — Always cleared 
Z — Always set 
V — Always cleared 
Instruction Format: 


CLR D (parallel move) 
Opcode: 


15 12 11 8 7 4 3 0 
imar|yw wwe oo oF oo: 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
CLR24 Clear 24 MS-bits of Accumulator CLR24 


Operation:                                  Assembler Syntax:
0 -> bits 16-39 of D (parallel move)        CLR24 D (parallel move)


Description: Clear the 24 MS bits of the destination accumulator. This is a 24-bit clear instruction.


Example:
CLR24 A X:(R1),X1 ;clear 24 MS bits of A; update X1

            Before Execution      After Execution
A2:A1:A0    12 3456 789A          00 0000 789A


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $12:3456:789A. 
Execution of the CLR24 A instruction clears the 24 MS bits of the accumulator A. 
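
The effect of CLR24 on the accumulator can be sketched with a small Python model. This is only an illustration of the register arithmetic described above, not Motorola code; the helper names `clr24` and `fmt` are hypothetical, and the parallel move and condition codes are not modeled.

```python
MASK40 = (1 << 40) - 1  # the accumulator is 40 bits wide (A2:A1:A0)


def clr24(acc: int) -> int:
    """Clear bits 39..16 (A2:A1) of a 40-bit accumulator and keep A0."""
    return (acc & MASK40) & 0xFFFF


def fmt(acc: int) -> str:
    """Format a 40-bit value in the manual's $A2:A1:A0 notation."""
    acc &= MASK40
    return f"${acc >> 32:02X}:{(acc >> 16) & 0xFFFF:04X}:{acc & 0xFFFF:04X}"


print(fmt(clr24(0x12_3456_789A)))  # value taken from the example above
```

The input value is the one used in the example; only the low 16 bits survive the operation.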


Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition (see section A.4) 
L  — Set if data limiting has occurred during parallel move 
E — Always cleared 
U — Always set 
N — Always cleared 
Z — Always set 
V — Always cleared 

Instruction Format: 


CLR24 D (parallel move) 
Opcode: 


15 12 11 8 7 4 3 0 
marin w wwe to s/F oo: 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
CMP Compare CMP


Operation:                          Assembler Syntax:
D-S (parallel move)                 CMP S,D (parallel move)


Description: Subtract the two operands and update the condition code register. The result of the
subtraction operation is not stored.


Note: This instruction subtracts 40-bit operands. When a word is specified as S, it is sign extended and
zero filled to form a valid 40-bit operand. In order for the carry to be set correctly as a result of the
subtraction, D must be properly sign extended. D can be improperly sign extended by writing A1
or B1 explicitly prior to executing the compare so that A2 or B2, respectively, may not represent the
correct sign extension. This note particularly applies to the case where the intent is to compare
16-bit operands such as X0 with A1.


Example:
CMP Y0,A X0,X:(R1)+N1 ;comp. Y0 and A, save X0

            Before Execution      After Execution
A2:A1:A0    00 0020 0000          00 0020 0000
Y0          0024                  0024
SR=MR:CCR   0300                  0319


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $00:0020:0000
and the 16-bit Y0 register contains the value $0024. Execution of the CMP Y0,A instruction
automatically appends the 16-bit value in the Y0 register with 16 LS zeros, sign extends the
resulting 32-bit long word to 40 bits, subtracts the result from the 40-bit A accumulator and
updates the condition code register leaving accumulator A unchanged.
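
The CCR value in the example can be reproduced with a short Python sketch. It is a model under stated assumptions, not the manual's definition: the function names are hypothetical, no scaling mode is active, E is taken as "bits 39..31 are not all equal" and U as "bits 31 and 30 are equal" (the authoritative definitions are in Section A.4), and the S and L bits are not modeled.

```python
MASK40 = (1 << 40) - 1


def word_to_acc(word: int) -> int:
    """Sign extend a 16-bit word and append 16 LS zeros to form a 40-bit operand."""
    word &= 0xFFFF
    if word & 0x8000:
        word -= 0x10000
    return (word << 16) & MASK40


def cmp_ccr(d: int, s: int) -> int:
    """Return CCR bits E,U,N,Z,V,C (bits 5..0) for D - S; D itself is not modified."""
    d &= MASK40
    s &= MASK40
    raw = d - s
    res = raw & MASK40
    c = 1 if raw < 0 else 0                       # borrow out of bit 39
    n = (res >> 39) & 1
    z = 1 if res == 0 else 0
    v = (((d ^ s) & (d ^ res)) >> 39) & 1         # signed overflow of the subtraction
    e = 0 if (res >> 31) in (0, 0x1FF) else 1     # assumption: extension in use
    u = 1 if ((res >> 31) & 1) == ((res >> 30) & 1) else 0  # assumption: unnormalized
    return (e << 5) | (u << 4) | (n << 3) | (z << 2) | (v << 1) | c


ccr = cmp_ccr(0x00_0020_0000, word_to_acc(0x0024))
print(f"CCR = ${ccr:02X}")
```

With the example operands the model sets U, N, and C, which matches the low byte of the SR value shown after execution.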

Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of the result is in use
U - Set if result is unnormalized
N - Set if bit 39 of the result is set
Z - Set if result equals zero
V - Set if overflow has occurred in result
C - Set if a carry (or borrow) occurs from bit 39 of the result

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format: 


CMP S,D (parallel move) 
Opcode: 


15    12 11     8 7      4 3      0
1 m R R  H H H W  0 1 0 1  F J J J


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


S,D     JJJ F      S,D     JJJ F
B,A     000 0      Y0,B    101 1
A,B     000 1      X1,A    110 0
X0,A    100 0      X1,B    110 1
X0,B    100 1      Y1,A    111 0
Y0,A    101 0      Y1,B    111 1
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
CMPM Compare Magnitude CMPM


Operation:                          Assembler Syntax:
|D| - |S| (parallel move)           CMPM S,D (parallel move)


Description: Subtract the absolute values of the two operands and update the condition code register.
The result of the subtraction operation is not stored.


Note: This instruction subtracts absolute values (magnitude) of 40-bit operands. When a word is specified
as S, it is sign extended and zero filled to form a valid 40-bit operand. In order for the carry to be
set correctly as a result of the subtraction, D must be properly sign extended. D can be improperly
sign extended by writing A1 or B1 explicitly prior to executing the compare so that A2 or B2,
respectively, may not represent the correct sign extension. This note particularly applies to the case
where the intent is to compare 16-bit operands such as X0 with A1.


Example:
CMPM Y0,A X:(R1),X1 ;comp. |Y0| and |A|, update X1

            Before Execution      After Execution
A2:A1:A0    00 0006 0000          00 0006 0000
Y0          FFF7                  FFF7
SR=MR:CCR   0000                  0019


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $00:0006:0000
and the 16-bit Y0 register contains the value $FFF7. Execution of the CMPM Y0,A instruction
automatically appends the 16-bit value in the Y0 register with 16 LS zeros, sign extends
the resulting 32-bit long word to 40 bits, takes the absolute value of the resulting number,
subtracts the result from the absolute value of the 40-bit A accumulator and updates the
condition code register leaving the accumulator A unchanged.
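
A minimal Python sketch of the magnitude comparison follows. The helper names are hypothetical, and only the N, Z, and C outcomes are modeled; the remaining CCR bits follow the definitions in Section A.4 and are left out here.

```python
MASK40 = (1 << 40) - 1


def to_signed40(x: int) -> int:
    """Interpret a 40-bit pattern as a two's complement integer."""
    x &= MASK40
    return x - (1 << 40) if x & (1 << 39) else x


def word_to_acc(word: int) -> int:
    """Sign extend a 16-bit word and append 16 LS zeros to form a 40-bit operand."""
    word &= 0xFFFF
    if word & 0x8000:
        word -= 0x10000
    return (word << 16) & MASK40


def cmpm_nzc(d: int, s: int) -> dict:
    """Compute |D| - |S| and report the N, Z, and C flags; D is not modified."""
    diff = abs(to_signed40(d)) - abs(to_signed40(s))
    res = diff & MASK40
    return {"N": (res >> 39) & 1, "Z": int(res == 0), "C": int(diff < 0)}


# |A| = $0006:0000 and |Y0| = $0009:0000, so the difference is negative.
print(cmpm_nzc(0x00_0006_0000, word_to_acc(0xFFF7)))
```

For the example operands the magnitude of Y0 is larger than that of A, so the model reports a negative result with a borrow, consistent with N and C being set in the CCR value shown after execution.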

Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of the result is in use
U - Set if result is unnormalized
N - Set if bit 39 of the result is set
Z - Set if result equals zero
V - Set if overflow has occurred in result
C - Set if a carry (or borrow) occurs from bit 39 of the result

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format: 


CMPM S,D (parallel move)
Opcode:


15    12 11     8 7      4 3      0
1 m R R  H H H W  0 1 1 1  F J J J


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


S,D     JJJ F      S,D     JJJ F
B,A     000 0      Y0,B    101 1
A,B     000 1      X1,A    110 0
X0,A    100 0      X1,B    110 1
X0,B    100 1      Y1,A    111 0
Y0,A    101 0      Y1,B    111 1
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
DEBUG Enter Debug Mode DEBUG


Operation:                          Assembler Syntax:
Enter the debug mode                DEBUG


Description: Enter the debug mode and wait for OnCE commands.


Condition Codes Affected: 
Not affected 



Instruction Format: 


DEBUG 
Opcode: 


15    12 11     8 7      4 3      0
0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 1


Timing: 4 oscillator clock cycles 
Memory: 1 program word 
DEBUGcc Enter Debug Mode Conditional DEBUGcc


Operation:                                  Assembler Syntax:
If cc, then enter the debug mode            DEBUGcc
else PC+1 -> PC


Description: If the specified condition is true, enter the debug mode and wait for OnCE commands. If
the specified condition is false, continue with the next instruction.


The term “cc” may specify the following conditions:


“cc” Mnemonic                                   Condition

CC (HS) - carry clear (higher or same)          C=0
CS (LO) - carry set (lower)                     C=1
EC      - extension clear                       E=0
EQ      - equal                                 Z=1
ES      - extension set                         E=1
GE      - greater than or equal                 N ⊕ V=0
GT      - greater than                          Z+(N ⊕ V)=0
LC      - limit clear                           L=0
LE      - less than or equal                    Z+(N ⊕ V)=1
LS      - limit set                             L=1
LT      - less than                             N ⊕ V=1
MI      - minus                                 N=1
NE      - not equal                             Z=0
NR      - normalized                            Z+(Ū•Ē)=1
PL      - plus                                  N=0
NN      - not normalized                        Z+(Ū•Ē)=0


where: Ū denotes the logical complement of U,
       + denotes the logical OR operator,
       • denotes the logical AND operator,
       ⊕ denotes the logical Exclusive OR operator
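
The condition tests can be sketched as predicates over the CCR bits. This is a hypothetical Python model, not part of the instruction set definition: the names `flags`, `CONDITIONS`, and `cc_true` are invented for illustration, the mnemonics follow the usual Motorola DSP naming, and the CCR bit order (S, L, E, U, N, Z, V, C from bit 7 down to bit 0) is taken from the status register layout used throughout this appendix.

```python
def flags(ccr: int) -> dict:
    """Split an 8-bit CCR value into named flags (bit 0 = C ... bit 7 = S)."""
    return {name: (ccr >> bit) & 1 for bit, name in enumerate("CVZNUELS")}


# Each entry maps a mnemonic to a predicate on the flag dictionary.
CONDITIONS = {
    "CC": lambda f: f["C"] == 0,
    "CS": lambda f: f["C"] == 1,
    "EC": lambda f: f["E"] == 0,
    "EQ": lambda f: f["Z"] == 1,
    "ES": lambda f: f["E"] == 1,
    "GE": lambda f: (f["N"] ^ f["V"]) == 0,
    "GT": lambda f: (f["Z"] | (f["N"] ^ f["V"])) == 0,
    "LC": lambda f: f["L"] == 0,
    "LE": lambda f: (f["Z"] | (f["N"] ^ f["V"])) == 1,
    "LS": lambda f: f["L"] == 1,
    "LT": lambda f: (f["N"] ^ f["V"]) == 1,
    "MI": lambda f: f["N"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "NR": lambda f: (f["Z"] | ((1 - f["U"]) & (1 - f["E"]))) == 1,
    "PL": lambda f: f["N"] == 0,
    "NN": lambda f: (f["Z"] | ((1 - f["U"]) & (1 - f["E"]))) == 0,
}


def cc_true(mnemonic: str, ccr: int) -> bool:
    """Return True if the condition named by `mnemonic` holds for this CCR value."""
    return CONDITIONS[mnemonic.upper()](flags(ccr))


print([m for m in CONDITIONS if cc_true(m, 0x19)])  # CCR with U, N, and C set
```

A DEBUGcc instruction would enter the debug mode exactly when the corresponding predicate is true for the current CCR.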
Example: The following is an example of conditional breakpoint setting using DEBUGcc.

A conditional breakpoint can be set on the MAC instruction of the following sequence of code:

        :
        ASR4    A
        MAC     X0,Y1,A
        ADD     X1,A
        :

by replacing the MAC instruction by a JSR instruction as follows:

        :
        ASR4    A
        JSR     Break
        ADD     X1,A
        :
Break   DEBUGcc
        MAC     X0,Y1,A
        RTS
Condition Codes Affected: 
Not affected 


Instruction Format: 


DEBUGcc 
Opcode: 


15    12 11     8 7      4 3      0
0 0 0 0  0 0 0 0  0 1 0 1  c c c c


Instruction Fields: 


cc = 4-bit condition code = cccc 


[Table: Mnemonic versus cccc encoding; the bit values are illegible in the source]

Timing: 4 oscillator clock cycles 
Memory: 1 program word 
DEC Decrement Accumulator DEC


Operation:                          Assembler Syntax:
D-1 -> D (parallel move)            DEC D (parallel move)


Description: Decrement by one the destination accumulator. This is a 40-bit decrement instruction.


Example:
DEC A A,X0 ;save A into X0 before decrementing it

            Before Execution      After Execution
A2:A1:A0    12 3456 789A          12 3456 7899


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $12:3456:789A. 
Execution of the DEC A instruction decrements by one the 40-bit A accumulator. 


Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of the result is in use
U - Set if result is unnormalized
N - Set if bit 39 of the result is set
Z - Set if result equals zero
V - Set if overflow has occurred in result
C - Set if a carry (or borrow) occurs from bit 39 of the result

Instruction Format: 


DEC D (parallel move) 
Opcode: 


15    12 11     8 7      4 3      0
1 m R R  H H H W  0 1 1 0  F 0 1 0


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
DEC24 Decrement 24 MS-bit of Accumulator DEC24 


Operation:                                  Assembler Syntax:
D2:D1-1 -> D2:D1 (parallel move);           DEC24 D (parallel move)
D0 is unchanged


Description: Decrement by one the 24 MS bits of the destination accumulator.


Example:
DEC24 A X:(R1),X1 ;decrement 24 MS bits of A; update X1

            Before Execution      After Execution
A2:A1:A0    12 3456 789A          12 3455 789A


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $12:3456:789A.
Execution of the DEC24 A instruction decrements by one the 24 MS bits of the accumulator A.
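
The difference between a 24-bit and a 40-bit decrement is easy to see in a small Python model. The function name `dec24` is hypothetical and the sketch covers only the accumulator update, not the parallel move or the condition codes.

```python
MASK40 = (1 << 40) - 1


def dec24(acc: int) -> int:
    """Decrement D2:D1 (bits 39..16) by one and leave D0 (bits 15..0) unchanged."""
    acc &= MASK40
    upper = ((acc >> 16) - 1) & 0xFFFFFF   # 24-bit wraparound on borrow
    return (upper << 16) | (acc & 0xFFFF)


print(f"{dec24(0x12_3456_789A):010X}")  # value taken from the example above
```

Because the low word does not take part in the subtraction, decrementing an accumulator whose upper 24 bits are zero wraps them to all ones while D0 keeps its value.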


Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of the result is in use
U - Set if result is unnormalized
N - Set if bit 39 of the result is set
Z - Set if the 24 most significant bits of the result are all zeroes
V - Set if overflow has occurred in result
C - Set if a carry (or borrow) occurs from bit 39 of the result

Instruction Format: 


DEC24 D (parallel move) 
Opcode: 


15 12 11 8 7 4 3 0 


marin www tt ofr ot: 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
DIV Divide Iteration DIV 


Operation:                                                      Assembler Syntax:
If D[39] ⊕ S[15] = 1 then                                       DIV S,D (parallel move)
    D (D2:D1:D0) shifted left one bit, C -> D[0]; D1+S -> D1
else
    D (D2:D1:D0) shifted left one bit, C -> D[0]; D1-S -> D1


Description: Divide the destination operand D (dividend) by the source operand S (divisor) and store the
result in the destination accumulator D. The 32-bit dividend must be a positive fraction
which has been sign extended to 40 bits and is stored in the full 40-bit destination
accumulator D. The 16-bit divisor is a signed fraction and is stored in the source operand S.
Each DIV iteration calculates one quotient bit using a nonrestoring fractional division
algorithm (described below). After execution of the first DIV instruction, the destination
operand holds both the partial remainder and the formed quotient. The partial remainder
occupies the high order portion of the destination accumulator D and is a signed fraction.
The formed quotient occupies the low order portion of the destination accumulator D
(A0 or B0) and is a positive fraction. One bit of the formed quotient is shifted into the LSB
of the destination accumulator at the start of each DIV iteration. The formed quotient is
the true quotient if the true quotient is positive. If the true quotient is negative, the formed
quotient must be negated. Valid results are obtained only when |D| < |S| and the operands
are interpreted as fractions. Note that this condition ensures that the magnitude of the
quotient is less than one (i.e., is fractional) and precludes division by zero.

The DIV instruction calculates one quotient bit based on the divisor and the previous partial
remainder. To produce an N-bit quotient, the DIV instruction is executed N times where N
is the number of bits of precision desired in the quotient, 1 ≤ N ≤ 16. Thus, for a full precision
(16 bit) quotient, 16 DIV iterations are required. In general, executing the DIV instruction N
times produces an N-bit quotient and a 32-bit remainder which has (32 - N) bits of precision
and whose N MS bits are zeros. The partial remainder is not a true remainder and must be
corrected due to the nonrestoring nature of the division algorithm before it may be used.
Therefore, once the divide is complete, it is necessary to reverse the last DIV operation and
restore the remainder to obtain the true remainder.


The DIV instruction uses a nonrestoring fractional division algorithm which consists of the following
operations:

1. Compare the source and destination operand sign bits: An exclusive OR operation is performed on
bit 39 of the destination operand D and bit 15 of the source operand S;

2. Shift the partial remainder and the quotient: the 40-bit destination accumulator D is shifted one bit
to the left. The carry bit C is moved into the LSB (bit 0) of the accumulator;

3. Calculate the next quotient bit and the new partial remainder: The 16-bit source operand S (signed
divisor) is either added to, or subtracted from, the MSP portion of the destination accumulator (A1
or B1) and the result is stored back into the MSP portion of that destination accumulator. If the result
of the exclusive OR operation described above was a “1” (i.e., the sign bits were different), the
source operand S is added to the accumulator. If the result of the exclusive OR operation was a “0”
(i.e., the sign bits were the same), the source operand S is subtracted from the accumulator. Due
to the automatic sign extension of the 16-bit signed divisor, the addition or subtraction operation
correctly sets the carry bit C of the condition code register with the next quotient bit.
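
The three operations above can be sketched as a bit-level Python model. This is an illustrative simulation under assumptions, not the hardware definition: the names `div_step` and `div_n` are hypothetical, the divisor is applied to the accumulator as a sign-extended value aligned with A1, and only the C bit is modeled (set when bit 39 of the result is cleared, as stated in the condition code description for DIV).

```python
MASK40 = (1 << 40) - 1


def div_step(d: int, s: int, c: int):
    """One DIV iteration on a 40-bit accumulator d with 16-bit divisor s and carry c."""
    d &= MASK40
    s &= 0xFFFF
    # 1. Compare the sign bits of D (bit 39) and S (bit 15).
    signs_differ = ((d >> 39) & 1) ^ ((s >> 15) & 1)
    # 2. Shift D left one bit and move the carry into bit 0.
    d = ((d << 1) | (c & 1)) & MASK40
    # 3. Add or subtract the sign-extended divisor at the A1 position.
    s_signed = s - 0x10000 if s & 0x8000 else s
    addend = s_signed << 16
    d = (d + addend if signs_differ else d - addend) & MASK40
    c = 0 if (d >> 39) & 1 else 1   # next quotient bit
    return d, c


def div_n(d: int, s: int, n: int = 16):
    """Clear the carry, then repeat DIV n times (like ANDI #$FE,CCR / REP #n / DIV)."""
    c = 0
    for _ in range(n):
        d, c = div_step(d, s, c)
    return d, c


acc, carry = div_n(0x00_0E66_D7F2, 0x1234)
print(f"A = {acc:010X}, quotient = ${acc & 0xFFFF:04X}")
```

Sixteen iterations leave the formed quotient in the low word and the partial remainder in A1; adding the divisor back to A1 gives the low 16 bits of the true remainder.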


Example: (4 Quadrant division, 16-bit signed quotient, 32-bit signed remainder)

        ABS     A       A,B         ;make dividend positive, copy A1 to B1
        MOVE    B,X:$0              ;save rem. sign in X:$0
        EOR     Y0,B                ;quotient sign in N bit of CCR
        ANDI    #$FE,CCR            ;clear carry bit C (quotient sign bit)
        REP     #$10                ;form a 16-bit quotient
        DIV     Y0,A                ;form quotient in A0, remainder in A1
        TFR     A,B                 ;save quotient and remainder in B1,B0
        JPL     SAVEQ               ;go to SAVEQ if quotient is positive
        NEG     B                   ;complement quotient if N bit set
SAVEQ   TFR     Y0,B    B0,Y1       ;save quotient in Y1, get signed divisor
        ABS     B                   ;get absolute value of signed divisor
        ADD     A,B                 ;restore remainder in B1
        BFTSTL  #$8000,X:$0         ;test sign of remainder
        BCS     DONE                ;go to DONE if remainder is positive
        MOVE    #$0,B0              ;clear LS 16 bits of B
        NEG     B                   ;complement remainder if negative
DONE

            Before Execution      After Execution
A2:A1:A0    00 0E66 D7F2          00 121E 6544
Y1:Y0       0000 1234             6544 1234
B2:B1:B0    00 0000 0000          00 2452 6544


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the 40-bit, sign extended
fractional dividend D (D = $00:0E66:D7F2 = 0.112513535656035 (approx.)) and the 16-bit
Y0 register contains the 16-bit, signed fractional divisor S (S = $1234 = 0.1422119). Since
|D| < |S|, the execution of the divide routine given above stores the correct 16-bit signed
quotient in the 16-bit Y1 register (A/Y0 = 0.7911072 = $6544 = Y1). The partial remainder
is restored by reversing the last DIV operation and adding back the absolute value of the
signed divisor in Y0 to the partial remainder in A1. This produces the correct LS 16 bits of
the 32-bit signed remainder in the 16-bit B1 register. Note that the remainder is really a
32-bit value which has 16 bits of precision. Thus, the correct 32-bit remainder is $0000:2452
which is approximately 0.000004329718649.
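
The quotient and remainder quoted in this example can be cross-checked with plain integer arithmetic. The sketch below is only a sanity check, not the DIV algorithm: it assumes the dividend is the 32-bit fraction D/2^31 and the divisor the 16-bit fraction S/2^15, so that a 16-bit fractional quotient q/2^15 satisfies q = floor(D / (2*S)).

```python
D = 0x0E66D7F2   # A1:A0 before execution (32-bit fractional dividend)
S = 0x1234       # Y0 (16-bit fractional divisor)

# (D / 2**31) / (S / 2**15) = (q / 2**15) + remainder term, hence q = D // (2 * S).
q, r = divmod(D, 2 * S)

print(f"quotient  = ${q:04X}")
print(f"remainder = ${r:08X} = {r / 2**31:.15f}")
```

The integer quotient and remainder agree with the values left in Y1 and B1 by the routine.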


Note: The divide routine used in the example above assumes that the sign extended 40-bit signed
fractional dividend is stored in the A accumulator and that the 16-bit signed fractional divisor is stored
in the Y0 register. This routine produces a full 16-bit signed quotient and a 32-bit signed remainder.
This routine may be greatly simplified for the case in which only unsigned operands are used to
produce a 16-bit positive quotient and a 32-bit positive remainder, as shown below.


1 Quadrant division, 16-bit unsigned quotient, 32-bit unsigned remainder


        ANDI    #$FE,CCR            ;clear carry bit C (quotient sign bit)
        REP     #$10                ;form a 16-bit quotient and remainder
        DIV     X0,A                ;form quotient in A0, remainder in A1
        ADD     X0,A                ;restore remainder in A1


This last routine assumes that the 40-bit positive, fractional, sign extended dividend is stored in the
A accumulator and that the 16-bit positive, fractional divisor is stored in the X0 register. After
execution, the 16-bit positive fractional quotient is stored in the A0 register while the LS 16 bits of the
32-bit positive fractional remainder are stored in the A1 register.
Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


L — Set if overflow bit V is set 

V — Set if the MS bit of the destination operand is changed as a result of the 
instruction’s left shift operation 

C — Set if bit 39 of the result is cleared 


Instruction Format: 


DIV S,D (parallel move) 
Opcode: 
15    12 11     8 7      4 3      0
0 0 0 1  0 1 0 1  0 - - 0  F 1 D D
"-" = don’t care


Instruction Fields:


S,D     DD F      S,D     DD F
X0,A    00 0      X1,A    10 0
X0,B    00 1      X1,B    10 1
Y0,A    01 0      Y1,A    11 0
Y0,B    01 1      Y1,B    11 1
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
DMAC Double (Multi) Precision Multiply-Accumulate with 16-bit Right Shift DMAC


Operation:                                      Assembler Syntax:
S1*S2+[D>>16] -> D (no parallel move)           DMAC(ss,su,uu) S1,S2,D (no parallel move)


Description: Multiply the two 16-bit source operands S1 and S2 and add the product to the destination
accumulator D which has been previously shifted 16 bits to the right. The multiplication can
be performed on signed numbers (ss), unsigned numbers (uu), or mixed (unsigned x
signed, (su)) numbers. This instruction is optimized for multiprecision multiplication support.


Example:
DMACsu Y1,X0,A ;multiply signed Y1 by unsigned X0, add to A shifted right 16 bits

            Before Execution      After Execution
A2:A1:A0    12 3456 789A          00 00E0 3388
X0          FFFF                  FFFF
Y1


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $12:3456:789A.
Execution of the DMACsu Y1,X0,A multiplies the 16-bit signed value in Y1 by the 16-bit
unsigned value in X0, adds the result of the product to the accumulator A after A has been
shifted right and writes the final result in the accumulator A.


Warning: The saturation mode is ALWAYS disabled during execution of DMAC, even when the
saturation bit (SA) of the OMR is set. Refer to Section 5.8.3 for more details.


Condition Codes Affected: 


|<------------- MR ------------>|<------------ CCR ------------>|
  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
  LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of the result is in use
U - Set if result is unnormalized
N - Set if bit 39 of the result is set
Z - Set if result equals zero
V - Set if overflow has occurred in result
C - Set if a carry (or borrow) occurs from bit 39 of the result
Instruction Format: 


DMAC(ss,su,uu) S1,S2,D (no parallel move)
Opcode:


15    12 11     8 7      4 3      0
0 0 0 1  0 1 0 1  1 0 s 1  F s Q Q


Instruction Fields:


S1,S2,D     QQ F      S1,S2,D     QQ F
Y0,X0,A     00 0      X1,Y0,A     10 0
Y0,X0,B     00 1      X1,Y0,B     10 1
Y1,X0,A     01 0      X1,Y1,A     11 0
Y1,X0,B     01 1      X1,Y1,B     11 1


Arithmetic
ss
su
uu


Note: For DMACsu, the order of S1, S2 is significant; S1 will always be the signed
operand (i.e., Y0, Y1, X1).


“—” = don’t care 

Timing: 2 oscillator clock cycles 

Memory: 1 program word 
DO Start Hardware Do Loop DO

Operation:                                              Assembler Syntax:
SP+1 -> SP; LA -> SSH; LC -> SSL; X:<ea> -> LC          DO X:(Rn),expr
SP+1 -> SP; PC -> SSH; SR -> SSL; offset-1+PC -> LA
1 -> LF

SP+1 -> SP; LA -> SSH; LC -> SSL; #xx -> LC             DO #xx,expr
SP+1 -> SP; PC -> SSH; SR -> SSL; offset-1+PC -> LA
1 -> LF

SP+1 -> SP; LA -> SSH; LC -> SSL; S -> LC               DO S,expr
SP+1 -> SP; PC -> SSH; SR -> SSL; offset-1+PC -> LA
1 -> LF

End of Loop:

SSL(LF) -> SR; SP-1 -> SP
SSH -> LA; SSL -> LC; SP-1 -> SP

Description:


Begin a hardware DO loop that is to be repeated the number of times specified in the
instruction’s source operand and whose range of execution is terminated by the destination
operand (shown above as “expr”). No overhead other than the execution of this DO
instruction is required to set up this loop. DO loops can be nested and the loop count can be
passed as a parameter. During the first instruction cycle, the current contents of the Loop
Address (LA) and the Loop Counter (LC) registers are pushed onto the system stack. The
DO instruction’s source operand is then loaded into the Loop Counter (LC) register. The LC
register contains the remaining number of times the DO loop will be executed and can be
accessed from inside the DO loop subject to certain restrictions. If LC equals zero, the DO
loop is not executed. If immediate short data is specified, the 8 LS bits of LC are loaded
with the 8-bit immediate value and the eight MS bits of LC are cleared.


During the second instruction cycle, the current contents of the Program Counter (PC)
register and the Status Register (SR) are pushed onto the system stack. Stacking LA, LC, PC,
and SR permits nesting DO loops. The DO instruction’s destination address (shown as
offset, which is derived from “expr”) is then loaded into the Loop Address (LA) register after
having been added to the PC. This 16-bit operand is located in the instruction’s 16-bit
relative address extension word as shown in the opcode section. The value in the Program
Counter (PC) register pushed onto the system stack is the address of the first instruction
following the DO instruction (i.e., the first actual instruction in the DO loop). This value is
read (i.e., copied but not pulled) from the top of the system stack to return to the top of the
loop for another pass through the loop.


During the third instruction cycle, the Loop Flag (LF) is set. This results in the PC being
repeatedly compared with LA to determine if the last instruction in the loop has been fetched.
If LA equals PC, the last instruction in the loop has been fetched and the Loop Counter (LC)
is tested. If LC is not equal to one, it is decremented by one and SSH is loaded into the PC
to fetch the first instruction in the loop again. If LC equals one, the “end of loop” processing
begins.
When executing a DO loop, the instructions are actually fetched each time through the loop. 
Therefore, a DO loop can be interrupted. DO loops can also be nested. When DO loops are 
nested, the end of loop addresses must also be nested and are not allowed to be equal. 
The assembler generates an error message when DO loops are improperly nested. Nested 
DO loops are illustrated in the example. 


Note: The assembler determines the offset needed to calculate the address to be loaded into LA at
execution time. This offset is calculated by evaluating the end of loop expression “expr” and subtracting
the address of the next instruction following the DO instruction. This is done to accommodate the
case where the last word in the DO loop is a two word instruction. Thus, the end of loop expression
“expr” in the source code must represent the address of the instruction AFTER the last instruction
in the loop as shown in the example.
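
The offset arithmetic in this note can be worked through with a few lines of Python. This is a sketch under assumptions: the addresses are made up for illustration, the DO instruction is taken to occupy two program words (opcode plus the relative address extension word), and the PC used in "offset-1+PC -> LA" is taken to be the address of the instruction following the DO. The function name `loop_address` is hypothetical.

```python
def loop_address(do_addr: int, expr: int):
    """Return (offset, LA) for a DO instruction at do_addr with end-of-loop label expr."""
    next_instr = (do_addr + 2) & 0xFFFF           # first instruction inside the loop
    offset = (expr - next_instr) & 0xFFFF         # value the assembler stores
    la = (offset - 1 + next_instr) & 0xFFFF       # value loaded into LA at run time
    return offset, la


offset, la = loop_address(do_addr=0x0100, expr=0x0110)
print(f"offset = ${offset:04X}, LA = ${la:04X}")
```

Under these assumptions LA always equals expr - 1, the address of the last word in the loop, whether the last instruction is one or two words long.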


During the “end of loop” processing, the Loop Flag (LF) from the lower portion (SSL) of SP 
is written into the Status Register (SR), the contents of the Loop Address (LA) register are 
restored from the upper portion (SSH) of SP-1, the contents of the Loop Counter (LC) are 
restored from the lower portion (SSL) of SP-1 and the Stack Pointer (SP) is decremented 
by two. Instruction fetches now continue at the address of the instruction following the last 
instruction in the DO loop. Note that LF is the only bit in the Status Register (SR) that is 
restored after a hardware DO loop has been exited. 
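
The stacking behaviour described above can be illustrated with a simplified Python model of the loop hardware. Everything here is a hypothetical sketch: the class and method names are invented, the system stack is a plain list of (SSH, SSL) pairs, only the LF bit of SR is tracked, instruction fetch is not simulated, and loop counts are assumed to be at least one.

```python
class LoopUnit:
    """Toy model of the LA, LC, LF registers and the system stack."""

    def __init__(self):
        self.stack = []   # each entry models one (SSH, SSL) stack location
        self.la = 0
        self.lc = 0
        self.lf = 0

    def do(self, count: int, first_addr: int, last_addr: int) -> None:
        self.stack.append((self.la, self.lc))      # cycle 1: push LA and LC
        self.lc = count & 0xFFFF
        self.stack.append((first_addr, self.lf))   # cycle 2: push PC and SR (LF only)
        self.la = last_addr
        self.lf = 1                                # cycle 3: set the Loop Flag

    def end_of_pass(self):
        """Call when the instruction at LA is fetched; returns the next PC or None."""
        if self.lc != 1:
            self.lc = (self.lc - 1) & 0xFFFF
            return self.stack[-1][0]               # SSH is read, not pulled
        _, self.lf = self.stack.pop()              # restore LF from SSL
        self.la, self.lc = self.stack.pop()        # restore LA and LC
        return None


def run_nested(cnt1: int, cnt2: int):
    """Count the passes through an outer loop and a loop nested inside it."""
    unit = LoopUnit()
    outer = inner = 0
    unit.do(cnt1, first_addr=0x102, last_addr=0x10F)
    while True:
        unit.do(cnt2, first_addr=0x104, last_addr=0x108)
        while True:
            inner += 1
            if unit.end_of_pass() is None:
                break
        outer += 1
        if unit.end_of_pass() is None:
            break
    return outer, inner, unit


outer, inner, unit = run_nested(3, 4)
print(outer, inner, len(unit.stack), unit.lf)
```

Each DO uses two stack locations, and the end-of-loop processing releases both while restoring the enclosing loop's LA, LC, and LF, which is what makes nesting work.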


Note: The Loop Flag (LF) is cleared by a hardware reset. 


Restrictions: The “end of loop” comparison described above actually occurs at instruction fetch time. That
is, LA is being compared with PC when the instruction at LA-2 is being executed. Therefore, instructions
which access the program controller registers and/or change program flow cannot be used in locations
LA-2, LA-1, or LA.


Proper DO loop operation is not guaranteed if an instruction starting at address LA-2, LA-1, or LA specifies
one of the program controller registers SR, SP, SSL, LA, LC, or (implicitly) PC as a destination register.
Similarly, the SSH program controller register may not be specified as a source or destination register in an
instruction starting at address LA-2, LA-1, or LA. Additionally, the SSH register cannot be specified as a
source register in the DO instruction itself and LA cannot be used as a target for jumps to subroutine (i.e.,
BSR, JSR, BScc, or JScc to LA). A DO instruction cannot be repeated using the REP instruction.
The following instructions cannot begin at the indicated position(s) near the end of a DO loop: 


At LA-2, LA-1 and LA      DO
                          MOVEC from SSH
                          MOVEC to LA, LC, SR, SP, SSH or SSL
                          ANDI MR
                          ORI MR
                          Two word instructions which read LC, SP, or SSL


At LA-1                   ENDDO, BRKcc
                          Single word instructions which read LC, SP, or SSL


At LA                     any two-word instruction*      RESET
                          Bcc, Jcc                       RTI
                          BRA, JMP                       RTS
                          BScc, JScc                     STOP
                          BSR, JSR                       WAIT
                          REP, REPcc


*This restriction applies to the situation in which the DSP Simulator’s single line assembler is used to change 
the last instruction in a DO loop from a one-word instruction to a two-word instruction. 


Other Restrictions DO SSH,xxxx 
BSR, JSR to (LA) whenever the Loop Flag (LF) is set 
BScc, JScc to (LA) whenever the Loop Flag (LF) is set 


A DO instruction cannot be repeated using the REP instruction. 


Notes: Due to pipelining, if an address register (R0-R3, N0-N3 or M0-M3) is changed using a move-type
instruction (LUA, Tcc, MOVE, MOVEC, MOVER, or parallel move), the new contents of the
destination address register will not be available for use during the following instruction (i.e., there is a
single instruction cycle pipeline delay). This restriction also applies to the situation in which the last
instruction in a DO loop changes an address register and the first instruction at the top of the DO
loop uses that same address register. The top instruction becomes the following instruction
because of the loop construct.
Similarly, since the DO instruction accesses the program controller registers, the DO instruction must not 
be immediately preceded by any of the following instructions: 


Immediately before DO MOVEC to LA, LC, SSH, SSL or SP 
MOVEC from SSH 

Example:

        DO      #cnt1,END1              ;begin outer DO loop

        DO      #cnt2,END2              ;begin inner DO loop

        MOVE    A,X:(R0)+               ;last instruction in inner loop
END2    :                               ;(in outer loop)

        ADD     A,B     X:(R1)+,X0      ;last instruction in outer loop
END1    :                               ;first instruction after outer loop


Explanation of Example: This example illustrates a nested DO loop. The outer DO loop will be executed 
“cnt1” times while the inner DO loop will be executed (“cnt1” * “cnt2”) times. Note that the 
labels END1 and END2 are located at the first instruction past the end of the DO loop, as 
mentioned above, and are nested properly. 
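
The iteration counts stated in this explanation can be checked with a small behavioral model. The Python sketch below is not DSP56100 code: the function name nested_do_counts is hypothetical, and it assumes both loop counts are at least one.

```python
def nested_do_counts(cnt1, cnt2):
    """Count passes through each loop body for DO #cnt1 wrapping DO #cnt2.

    Assumes cnt1 >= 1 and cnt2 >= 1 (illustrative model only).
    """
    outer_passes = 0
    inner_passes = 0
    for _ in range(cnt1):          # outer DO #cnt1,END1
        for _ in range(cnt2):      # inner DO #cnt2,END2
            inner_passes += 1      # body of the inner loop
        outer_passes += 1          # remainder of the outer loop body
    return outer_passes, inner_passes
```

With cnt1 = 3 and cnt2 = 4, the model reports 3 outer passes and 12 inner passes, matching the ("cnt1" * "cnt2") figure given for the inner loop.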


Condition Codes: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


LF — Set when a DO loop is in progress 
L — Set if data limiting occurred 


Note: If A or B is specified as a source operand, the accumulator value is optionally shifted according to
the scaling mode bits in the status register. If the data out of the shifter indicates that the accumulator
extension is in use, the 16-bit data is limited to a maximum positive or negative saturation constant.
The shifted and limited value is loaded into LC, although A or B remain unchanged.


Instruction Format and Opcode: 


DO X:(Rn),expr

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  0  0  0 | 1  1  0  — | —  —  R  R
  Relative Address Displacement Extension

 RR   Rn
 00   R0
 01   R1
 10   R2
 11   R3

“—” = don’t care
DO #xx,expr

 15      12  11       8  7        4  3        0
 [opcode bit pattern illegible in source]
 Relative Address Displacement Extension

 iiiiiiii = immediate 8-bit short data


DO S,expr

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  1  0  0 | 0  0  0  D | D  D  D  D
  Relative Address Displacement Extension


 S     DDDDD     S     DDDDD     S     DDDDD     S     DDDDD

 X0    00000     SR    01001     R0    10000     SSH   11000
 Y0    00001     OMR   01010     R1    10001     SSL   11001
 X1    00010     SP    01011     R2    10010     LA    11010
 Y1    00011     A1    01100     R3    10011     LC    01000
 A     00100     B1    01101     M0    10100     N0    11100
 B     00101     A2    01110     M1    10101     N1    11101
 A0    00110     B2    01111     M2    10110     N2    11110
 B0    00111                     M3    10111     N3    11111


Note: For DO SP, expr
The actual value that will be loaded into the Loop Counter (LC) is the value of the Stack Pointer
(SP) before the execution of the DO instruction, incremented by one. Thus, if SP = 3, the execution
of the DO SP, expr instruction will load the Loop Counter (LC) with the value LC = 4.


For DO SSL, expr
The Loop Counter (LC) will be loaded with its previous value which was saved on the stack by
the DO instruction itself.


If A or B is specified as a source operand, the accumulator value is optionally shifted according
to the scaling mode bits in the status register. If the data out of the shifter indicates that the
accumulator extension is in use, the 16-bit data is limited to a maximum positive or negative saturation
constant. The shifted and limited value is loaded into LC, although A or B remain unchanged.


Instruction Field for the second word: 


expr = 16-bit PC Relative Address 


Timing: 10 + mv oscillator clock cycles if the DO argument equals zero; 
otherwise it is 6 + mv oscillator clock cycles 
Memory: 2 program words 
DO FOREVER Start Infinite Loop DO FOREVER


Operation:                                            Assembler Syntax:
SP+1 -> SP; LA -> SSH; LC -> SSL                      DO FOREVER expr
SP+1 -> SP; PC -> SSH; SR -> SSL; expr-1+PC -> LA

1 -> LF; 1 -> FV


Description: Begin a hardware DO loop that is to be repeated forever and whose range of execution is
terminated by the destination operand (shown above as “expr”). No overhead other than
the execution of this DO FOREVER instruction is required to set up this loop. DO FOREVER
loops can be nested. During the first instruction cycle, the current contents of the Loop
Address (LA) and the Loop Counter (LC) registers are pushed onto the system stack. The
loop counter (LC) register is pushed onto the stack but is not updated by this instruction.


During the second instruction cycle, the current contents of the Program Counter (PC) register
and the Status Register (SR) are pushed onto the system stack. Stacking the LA, LC,
PC, and SR registers permits nesting DO FOREVER loops. The DO FOREVER instruction’s
destination operand (shown as “expr”) is then loaded into the Loop Address (LA) register
after having been added to the PC. This 16-bit operand is located in the instruction’s
16-bit relative address extension word as shown in the opcode section. The value in the
Program Counter (PC) register pushed onto the system stack is the address of the first
instruction following the DO FOREVER instruction (i.e., the first actual instruction in the DO
FOREVER loop). This value is read (i.e., copied but not pulled) from the top of the system
stack to return to the top of the loop for another pass through the loop.


During the third instruction cycle, the Loop Flag (LF) and the ForeVer flag (FV) are set. This
results in the PC being repeatedly compared with LA to determine if the last instruction in the
loop has been fetched. If LA equals PC, the last instruction in the loop has been fetched
and SSH is loaded into the PC to fetch the first instruction in the loop again. The loop
counter (LC) register is then decremented by one without being tested. This register can be
used by the programmer to count the number of loops already executed.


When executing a DO FOREVER loop, the instructions are actually fetched each time
through the loop. Therefore, a DO FOREVER loop can be interrupted. DO FOREVER loops
can also be nested. When DO FOREVER loops are nested, the end of loop addresses must
also be nested and are not allowed to be equal. The assembler generates an error message
when DO FOREVER loops are improperly nested. Nested DO loops with one DO
FOREVER loop are illustrated in the example.


Note: The assembler determines the offset needed to calculate the address to be loaded into LA at
execution time. This offset is calculated by evaluating the end of loop expression “expr” and subtracting
the address of the next instruction following the DO instruction. This is done to accommodate the
case where the last word in the DO FOREVER loop is a two word instruction. Thus, the end of loop
expression “expr” in the source code must represent the address of the instruction after the last
instruction in the loop as shown in the example.
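
The offset calculation in this note can be illustrated with a short behavioral model. The Python sketch below is an assumption-laden illustration, not assembler source: the function names are hypothetical, the displacement is assumed to be stored as a 16-bit two's complement value, and LA is computed as expr-1+PC following the Operation line, with PC taken to be the address of the first instruction in the loop.

```python
DO_FOREVER_WORDS = 2  # the instruction occupies 2 program words (see "Memory")


def assemble_displacement(do_addr, end_label):
    """Displacement the assembler stores in the extension word.

    end_label is the address of the instruction AFTER the last loop instruction.
    Assumption: the displacement is kept as a 16-bit two's complement value.
    """
    next_instr = do_addr + DO_FOREVER_WORDS
    return (end_label - next_instr) & 0xFFFF


def loop_address(do_addr, displacement):
    """LA as loaded at execution time, modelled as expr - 1 + PC."""
    pc = do_addr + DO_FOREVER_WORDS  # PC points at the first loop instruction
    signed = displacement - 0x10000 if displacement & 0x8000 else displacement
    return (pc + signed - 1) & 0xFFFF
```

For a DO FOREVER at P:$0100 with its end label at P:$0110, the model stores a displacement of $000E and loads LA with $010F, the last word of the loop. This is why a two-word final instruction is still handled correctly when the label marks the instruction after the loop.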


The loop counter (LC) register is never tested by the DO FOREVER instruction and the only way of
terminating the loop process is to use either the ENDDO or BRKcc instructions. LC is decremented
every time PC=LA so that it can be used by the programmer to keep track of the number of times
the DO FOREVER loop has been executed. If the programmer wants to initialize LC to a particular
value before the DO FOREVER instruction and the loop is nested, the current LC value should be
saved first and restored immediately after exiting the nested DO FOREVER loop.


Restrictions: The “end of loop” comparison described above actually occurs at instruction fetch time. That 
is, LA is being compared with PC when the instruction at LA-2 is being executed. Therefore, instructions 
which access the PCU registers and/or change program flow cannot be used in locations LA-2, LA-1 or LA. 


Proper DO FOREVER loop operation is not guaranteed if an instruction starting at address LA-2, LA-1, or
LA specifies one of the program control unit registers SR, SP, SSL, LA, or (implicitly) PC as a destination
register. Similarly, the SSH register may not be specified as a source or destination register in an instruction
starting at address LA-2, LA-1, or LA. Additionally, the SSH register cannot be specified as a source register
in the DO FOREVER instruction itself and LA cannot be used as a target for jumps to subroutine (i.e., BSR,
JSR, BScc, or JScc to LA). A DO FOREVER instruction cannot be repeated using the REP instruction.


The following instructions cannot begin at the indicated position(s) near the end of a DO FOREVER loop: 


At LA-2, LA-1, and LA     DO
                          MOVEC from SSH
                          MOVEC to LA, SR, SP, SSH or SSL
                          ANDI MR
                          ORI MR
                          Two-word instructions which read SP or SSL


At LA-1                   ENDDO, BRKcc
                          Single-word instructions which read SP or SSL


At LA                     Any two-word instruction*      RESET
                          Bcc, Jcc                       RTI
                          BRA, JMP                       RTS
                          BScc, JScc                     STOP
                          BSR, JSR                       WAIT
                          REP, REPcc


*This restriction applies to the situation in which the DSP Simulator’s single line assembler is used to change
the last instruction in a DO FOREVER loop from a one-word instruction to a two-word instruction.


Other Restrictions BSR, JSR to (LA) whenever the Loop Flag (LF) is set 
BScc, JScc to (LA) whenever the Loop Flag (LF) is set 


Note: Due to pipelining, if an address register (R0-R3, N0-N3 or M0-M3) is changed using a move-type
instruction (LEA, Tcc, MOVE, MOVEC, or parallel move), the new contents of the destination
address register will not be available for use during the following instruction (i.e., there is a single
instruction cycle pipeline delay). This restriction also applies to the situation in which the last
instruction in a DO loop changes an address register and the first instruction at the top of the DO loop uses
that same address register. The top instruction becomes the following instruction because of the
loop construct.


Similarly, since the DO instruction accesses the PCU registers, the DO instruction must not be immediately
preceded by any of the following instructions:


Immediately before DO     MOVEC to LA, SSH, SSL or SP
                          MOVEC from SSH


Example:

        DO      #cnt1,END1          ;begin outer DO loop
        DO      FOREVER,END2        ;begin inner DO loop
        BEQ     REM
        ENDDO                       ;ENDDO if not EQ
        ENDDO                       ;ENDDO for leaving outer loop
        BRA     END1                ;Branch to (END1) out of upper loop
REM
        BRKNN                       ;conditional exit of DO FOREVER; branch to END2 exiting
                                    ; loop
        MOVE    A,X:(R0)+           ;last instruction in inner loop
END2                                ;first instruction in outer loop
        ADD     A,B  X:(R1)+,X0     ;last instruction in outer loop
END1                                ;first instruction after outer loop


Explanation of Example: This example illustrates a nested DO loop with one DO FOREVER loop. The
outer DO loop will be executed “cnt1” times while the inner DO FOREVER loop will be
executed until the ENDDO or BRKNN instruction is executed. Note that the labels END1 and END2
are located at the first instruction past the end of the DO loop, as mentioned above, and are
nested properly.


Condition Codes Affected: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


LF — Set when a DO loop is in progress 


Instruction Format: 


DO FOREVER expr 


Opcode: 

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 0  0  1  0
  Relative Address Displacement Extension
Timing: 6 oscillator clock cycles 
Memory: 2 program words 
ENDDO End Current DO Loop ENDDO


Operation:                                  Assembler Syntax:
SSL(LF,FV) -> SR; SP-1 -> SP                ENDDO
SSH -> LA; SSL -> LC; SP-1 -> SP


Description: Terminate the current hardware DO loop before the current loop counter (LC) equals one.
It also terminates the DO FOREVER loop. If the value of the current DO loop counter (LC)
is needed, it must be read before the execution of the ENDDO instruction. Initially, the loop
flag (LF) and the ForeVer flag (FV) are restored from the system stack and the remaining
portion of the status register (SR) and the program counter (PC) are purged from the system
stack. The loop address (LA) and the loop counter (LC) registers are then restored from
the system stack.
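
The stack traffic described for DO FOREVER and ENDDO can be sketched as a small behavioral model. The Python class below is illustrative only: the class and attribute names are hypothetical, the system stack is modelled as a list of (SSH, SSL) pairs whose length stands in for SP, and the status register is reduced to the LF and FV flags because their bit positions are not needed here.

```python
class LoopStateModel:
    """Toy model of the PCU registers touched by DO FOREVER and ENDDO."""

    def __init__(self):
        self.stack = []                    # entries are (SSH, SSL) pairs
        self.pc = 0
        self.la = 0
        self.lc = 0
        self.flags = {"LF": 0, "FV": 0}    # only the SR bits ENDDO restores

    def do_forever(self, new_la):
        # First cycle: SP+1 -> SP; LA -> SSH; LC -> SSL
        self.stack.append((self.la, self.lc))
        # Second cycle: SP+1 -> SP; PC -> SSH; SR -> SSL; then LA is loaded
        self.stack.append((self.pc, dict(self.flags)))
        self.la = new_la
        # Third cycle: 1 -> LF; 1 -> FV (LC is not updated)
        self.flags["LF"] = 1
        self.flags["FV"] = 1

    def enddo(self):
        # SSL(LF,FV) -> SR; SP-1 -> SP (stacked PC and rest of SR are purged)
        _, saved_flags = self.stack.pop()
        self.flags["LF"] = saved_flags["LF"]
        self.flags["FV"] = saved_flags["FV"]
        # SSH -> LA; SSL -> LC; SP-1 -> SP
        self.la, self.lc = self.stack.pop()
```

In this model ENDDO undoes exactly what the loop setup pushed: two stack locations are released, and LA, LC, LF and FV return to their values from before the loop. It does not change PC, which matches the remark in the example that a JMP or BRA is still needed to leave the loop body.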


Restrictions: Due to pipelining and the fact that the ENDDO instruction accesses the program controller
registers, the ENDDO instruction must not be immediately preceded by any of the following instructions:


Immediately before ENDDO     MOVEC to LA, LC, SR, SSH, SSL or SP
                             MOVEC from SSH
                             ORI MR
                             ANDI MR


Also, the ENDDO instruction cannot be the next to last instruction in a DO loop (at LA-1). 


Example:

        DO      Y0,NEXT             ;exec. loop ending at NEXT (Y0) times
        MOVEC   LC,A                ;get current value of loop counter (LC)
        CMP     Y1,A                ;compare loop counter with value in Y1
        JNE     ONWARD              ;go to ONWARD if LC not equal to Y1
        ENDDO                       ;LC equal to Y1, restore all DO registers
        JMP     NEXT                ;go to NEXT

ONWARD  :                           ;LC not equal to Y1, continue DO loop

        :                           ;(last instruction in DO loop)
NEXT    MOVE    #$123456,X1         ;(first instruction AFTER DO loop)


Explanation of Example: This example illustrates the use of the ENDDO instruction to terminate the
current DO loop. The value of the loop counter (LC) is compared with the value in the Y1
register to determine if execution of the DO loop should continue. Note that the ENDDO
instruction updates certain program controller registers but does not automatically jump past
the end of the DO loop. Thus, if this action is desired, a JMP/BRA instruction (i.e., JMP
NEXT as shown above) must be included after the ENDDO instruction to transfer program
control to the first instruction past the end of the DO loop.


Condition Codes Affected: 


The condition codes are not affected by this instruction. 


Instruction Format: 


ENDDO 
Opcode: 


 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 1  0  0  1


Timing: 2 oscillator clock cycles 
Memory: 1 program word 
EOR Logical Exclusive OR EOR


Operation:                                      Assembler Syntax:
S ⊕ D[31:16] -> D[31:16] (parallel move)        EOR S,D (parallel move)

Description: Logically Exclusive OR the source operand S with bits 31-16 of the destination operand D
and store the result in bits 31-16 of the destination accumulator. This instruction is a 16-bit
operation. The remaining bits of the destination operand D are not affected.


Example:
        EOR     Y1,B  (R2)-         ;Exclusive OR Y1 with B1, update R2

        Before Execution                After Execution
   B    00 0005 6789               B    00 0006 6789
        B2  B1   B0                     B2  B1   B0
   Y1   0003                       Y1   0003


Explanation of Example: Prior to execution, the 16-bit Y1 register contains the value $0003 and the
40-bit B accumulator contains the value $00:0005:6789. The EOR Y1,B instruction logically
exclusive OR’s the 16-bit value in the Y1 register with bits 31-16 of the B accumulator (B1)
and stores the 40-bit result in the B accumulator. Note that the lower word of the accumulator,
B0, and the extension byte, B2, are not affected by the operation.
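
The effect on the accumulator can be reproduced with a short behavioral model. The Python sketch below is illustrative only: the function name eor is hypothetical, the accumulator is represented as a 40-bit integer, and neither the parallel move nor the condition codes are modelled.

```python
MASK40 = (1 << 40) - 1      # 40-bit accumulator
D1_MASK = 0xFFFF << 16      # bits 31-16 of the accumulator


def eor(src16, acc40):
    """Exclusive OR a 16-bit source with bits 31-16 of a 40-bit accumulator."""
    d1 = (acc40 & D1_MASK) >> 16
    result = d1 ^ (src16 & 0xFFFF)
    # Keep the extension byte and the low word, replace only bits 31-16.
    return (acc40 & MASK40 & ~D1_MASK) | (result << 16)
```

Calling eor(0x0003, 0x0000056789) yields 0x0000066789, the value shown for B after execution.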


Condition Codes Affected: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


S — Computed according to the standard definition (see section A.4)
L — Set if data limiting has occurred during parallel move
N — Set if bit 31 of A or B result is set
Z — Set if bits 31-16 of A or B result are zero
V — Always cleared


Instruction Format: 


EOR S,D (parallel move) 
Opcode: 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


 S,D     JJ   F
 X0,A    00   0
 X0,B    00   1
 Y0,A    01   0
 Y0,B    01   1
 X1,A    10   0
 X1,B    10   1
 Y1,A    11   0
 Y1,B    11   1


Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
EXT Sign Extend Accumulator EXT


Operation:                               Assembler Syntax:
bit 31 of D -> [bit 39-32] of D          EXT D (no parallel move)

Description: Sign Extend the Destination accumulator from the most significant bit of the upper word (bit
31 of D). The LS word of the destination accumulator is not affected.

Example:

        EXT     A

        Before Execution                After Execution
   A    FF 6432 0000               A    00 6432 0000
        A2  A1   A0                     A2  A1   A0


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $FF:6432:0000.
Since bit 31 of A is cleared, the execution of the EXT instruction clears the extension bits
32-39 and returns $00:6432:0000 in A which is a positive value.
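
The sign extension can be expressed as a short behavioral model. The Python sketch below is illustrative only: the function name ext is hypothetical, the accumulator is represented as a 40-bit integer, and the condition codes are not modelled.

```python
def ext(acc40):
    """Copy bit 31 of a 40-bit accumulator into extension bits 39-32."""
    low32 = acc40 & 0xFFFFFFFF                        # upper and lower words are kept
    extension = 0xFF if low32 & 0x80000000 else 0x00  # replicate bit 31
    return (extension << 32) | low32
```

Calling ext(0xFF64320000) returns 0x0064320000, as in the example above. A value with bit 31 set, such as 0x0080000000, becomes 0xFF80000000 instead.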


Condition Codes Affected: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


E — Always cleared
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Always cleared


Instruction Format: 


EXT D 
Opcode: 


 15      12  11       8  7        4  3        0
  0  0  0  1 | 0  1  0  1 | 0  1  0  1 | F  0  1  0


Instruction Fields: 


D F 
A 0 
B 1 
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
ILLEGAL Illegal Instruction Interrupt ILLEGAL


Operation: Assembler Syntax: 


Begin Illegal instruction exception routine ILLEGAL (no parallel move) 


Description: Normal instruction execution is suspended and Illegal Instruction exception processing is
initiated. The interrupt priority level (I1, I0) is set to 3 in the status register if a long interrupt
service routine is used. The purpose of the Illegal interrupt is to force the DSP into an illegal
instruction exception for test purposes. If a fast interrupt is used with the ILLEGAL instruction,
an infinite loop will be formed (an illegal instruction interrupt normally returns to the
illegal instruction) which can only be broken by a hardware reset. Therefore, only long
interrupts should be used. Exiting an ILLEGAL instruction is a fatal error; the long exception
routine should indicate this condition and cause the system to be restarted.


If the ILLEGAL instruction is in a DO loop at LA and the instruction at LA-1 is being
interrupted, then LC will be decremented twice due to the same mechanism that causes LC to
be decremented twice if JSR, REP,... are located at LA.


Since REP is uninterruptable, repeating an ILLEGAL instruction results in the interrupt not 
being taken until after completion of the REP. After servicing the interrupt, program control 
will return to the address of the second word following the ILLEGAL instruction. Of course, 
the ILLEGAL interrupt service routine should abort further processing, and the processor 
should be reinitialized. 


Example: 
ILLEGAL 
Explanation of Example: see above description. 


Condition Codes Affected: 


The condition codes are not affected by this instruction. 


Instruction Format: 


ILLEGAL 

Opcode: 

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 1  1  1  1
Timing: 8 oscillator clock cycles 
Memory: 1 program word 
IMAC Integer Multiply-Accumulate IMAC


Operation:                                   Assembler Syntax:

(S1*S2+[D>>15])<<15 -> D2:D1;                IMAC S1,S2,D (no parallel move)
sign extend D2; leave D0 unchanged


Description: Integer Multiply the two 16-bit signed integer source operands S1 and S2 and add the
product to the upper word (D1) of the destination accumulator D leaving the lower word (D0)
unchanged. A 15-bit shift as opposed to a 16-bit shift is required because of the inherent
fractional nature of the multiplier. This is discussed more fully in Section 3.2.3.


Note: No overflow control or rounding is performed during integer multiply-accumulate instructions.
The result is always a 16-bit signed integer result which is sign extended to 24 bits.
Example:
        MOVE    R0,A                ;initialize A
        IMAC    Y0,X0,A             ;update A
        MOVE    X:(A1),B            ;use A1 as memory pointer

        Before Execution                After Execution
   A    00 0008 789A               A    00 0014 789A
        A2  A1   A0                     A2  A1   A0
   X0   0003                       X0   0003
   Y0   0004                       Y0   0004


Explanation of Example: Prior to execution, the 16-bit accumulator register A1 contains a 16-bit signed
integer value ($0008). The data ALU registers X0 and Y0 contain the 16-bit signed integer
values $0003 and $0004, respectively. Execution of the IMAC Y0,X0,A instruction integer
multiplies X0 and Y0 and accumulates the result ($0008 + $000C = $0014) in A1. A0 remains
unchanged and A2 is sign extended.
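
The arithmetic in this example can be checked with a short behavioral model. The Python sketch below is illustrative only: the function names are hypothetical, the accumulator is represented as a 40-bit integer, and, following the note that no overflow control is performed, the sum is assumed to wrap to 16 bits before it is sign extended into the extension byte. Condition codes are not modelled.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def imac(s1, s2, acc40):
    """Model of IMAC S1,S2,D on a 40-bit accumulator value."""
    d0 = acc40 & 0xFFFF                        # lower word is left unchanged
    d1 = (acc40 >> 16) & 0xFFFF
    total = to_signed16(s1) * to_signed16(s2) + to_signed16(d1)
    new_d1 = total & 0xFFFF                    # assumed wrap, no saturation
    new_d2 = 0xFF if new_d1 & 0x8000 else 0x00 # sign extend into D2
    return (new_d2 << 32) | (new_d1 << 16) | d0
```

With Y0 = $0004, X0 = $0003 and A = $00:0008:789A, imac(0x0004, 0x0003, 0x000008789A) returns 0x000014789A, the value shown after execution.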


Condition Codes Affected: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


E — Not defined
U — Not defined
N — Set if bit 39 of the result is set
Z — Set if the 24 MS bits of the result equal zero


Instruction Format: 


IMAC S1,S2,D
Opcode:


 15      12  11       8  7        4  3        0
  0  0  0  1 | 0  1  0  1 | 1  0  1  0 | F  Q  Q  Q


Instruction Fields:


 S1,S2,D    QQQ  F     S1,S2,D    QQQ  F
 X0,X0,A    000  0     Y0,X0,A    100  0
 X0,X0,B    000  1     Y0,X0,B    100  1
 X1,X0,A    001  0     Y1,X0,A    101  0
 X1,X0,B    001  1     Y1,X0,B    101  1
 A1,Y0,A    010  0     Y0,X1,A    110  0
 A1,Y0,B    010  1     Y0,X1,B    110  1
 B1,X0,A    011  0     Y1,X1,A    111  0
 B1,X0,B    011  1     Y1,X1,B    111  1

Timing: 2 oscillator clock cycles 

Memory: 1 program word 
IMPY Integer Multiply IMPY


Operation:                                   Assembler Syntax:

(S1*S2)<<15 -> D2:D1;                        IMPY S1,S2,D (no parallel move)
sign extend D2; leave D0 unchanged


Description: Integer Multiply the two 16-bit signed integer source operands S1 and S2 and store the
product in the upper word (D1) of the destination accumulator D leaving the lower word (D0)
unchanged.

Note: No overflow control or rounding is performed during integer multiply instructions. The
result is always a 16-bit signed integer result which is sign extended to 24 bits.


Example:
        IMPY    Y0,X0,A             ;form product
        MOVE    A1,R0               ;initialize pointer

        Before Execution                After Execution
   A    00 0008 789A               A    00 000C 789A
        A2  A1   A0                     A2  A1   A0
   X0   0003                       X0   0003
   Y0   0004                       Y0   0004


Explanation of Example: Prior to execution, the 16-bit accumulator register A1 contains a 16-bit signed
integer value ($0008). The data ALU registers X0 and Y0 contain the 16-bit signed integer
values $0003 and $0004, respectively. Execution of the IMPY Y0,X0,A instruction integer
multiplies X0 and Y0 and stores the result $000C in A1. A0 remains unchanged and A2 is sign
extended.
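
The result in this example can be checked with a short behavioral model. The Python sketch below is illustrative only: the function names are hypothetical, the accumulator is represented as a 40-bit integer, and the product is assumed to wrap to 16 bits (no overflow control) before being sign extended into the extension byte. Condition codes are not modelled.

```python
def signed16(value):
    """Interpret the low 16 bits of value as a two's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def impy(s1, s2, acc40):
    """Model of IMPY S1,S2,D on a 40-bit accumulator value."""
    d0 = acc40 & 0xFFFF                         # lower word is left unchanged
    product = signed16(s1) * signed16(s2)
    new_d1 = product & 0xFFFF                   # assumed wrap, no saturation
    new_d2 = 0xFF if new_d1 & 0x8000 else 0x00  # sign extend into D2
    return (new_d2 << 32) | (new_d1 << 16) | d0
```

With Y0 = $0004 and X0 = $0003, impy(0x0004, 0x0003, 0x000008789A) returns 0x00000C789A: the previous contents of A1 are overwritten rather than accumulated, and A0 keeps its value.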




Condition Codes Affected: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


E — Not defined
U — Not defined
N — Set if bit 39 of the result is set
Z — Set if the 24 MS bits of the result equal zero


Instruction Format: 


IMPY S1,S2,D
Opcode:


 15      12  11       8  7        4  3        0
  0  0  0  1 | 0  1  0  1 | 1  0  0  0 | F  Q  Q  Q


Instruction Fields:


 S1,S2,D    QQQ  F     S1,S2,D    QQQ  F
 X0,X0,A    000  0     Y0,X0,A    100  0
 X0,X0,B    000  1     Y0,X0,B    100  1
 X1,X0,A    001  0     Y1,X0,A    101  0
 X1,X0,B    001  1     Y1,X0,B    101  1
 A1,Y0,A    010  0     Y0,X1,A    110  0
 A1,Y0,B    010  1     Y0,X1,B    110  1
 B1,X0,A    011  0     Y1,X1,A    111  0
 B1,X0,B    011  1     Y1,X1,B    111  1

Timing: 2 oscillator clock cycles 

Memory: 1 program word 
INC Increment Accumulator INC


Operation: Assembler Syntax: 


D+1 —>D (parallel move) INC D (parallel move) 


Description: Increment by one the destination accumulator. This is a 40-bit increment instruction. 


Example:
        INC     A  A,X0             ;save A into X0 before incrementing it

        Before Execution                After Execution
   A    12 3456 789A               A    12 3456 789B
        A2  A1   A0                     A2  A1   A0


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $12:3456:789A. 
Execution of the INC A instruction increments by one the 40-bit A accumulator. 


Condition Codes Affected: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of the result is in use
U — Set if result is unnormalized
N — Set if bit 39 of the result is set
Z — Set if result equals zero
V — Set if overflow has occurred in result
C — Set if a carry (or borrow) occurs from bit 39 of the result


Instruction Format: 


INC D (parallel move) 
Opcode: 


 15      12  11       8  7        4  3        0
 [opcode bit pattern illegible in source]


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
INC24 Increment 24 MS-bit of Accumulator INC24


Operation: Assembler Syntax: 


D2:D1+1 — D2:D1 (parallel move); INC24 D (parallel move) 
DO is unchanged 


Description: Increment by one the 24 MS bits of the destination accumulator.


Example:
        INC24   A  X:(B1),X1        ;Increment 24 MS bits of A; update X1

        Before Execution                After Execution
   A    12 3456 789A               A    12 3457 789A
        A2  A1   A0                     A2  A1   A0


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $12:3456:789A.
Execution of the INC24 A instruction increments by one the 24 MS bits of the accumulator A.
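
The difference between INC and INC24 can be seen in a short behavioral model. The Python sketch below is illustrative only: the function names are hypothetical, the accumulator is represented as a 40-bit integer, and a result that no longer fits simply wraps because overflow and the condition codes are not modelled.

```python
MASK40 = (1 << 40) - 1


def inc(acc40):
    """INC D: add one to the whole 40-bit accumulator."""
    return (acc40 + 1) & MASK40


def inc24(acc40):
    """INC24 D: add one to D2:D1 (the 24 MS bits); D0 is unchanged."""
    d0 = acc40 & 0xFFFF
    upper24 = (acc40 >> 16) & 0xFFFFFF
    return (((upper24 + 1) & 0xFFFFFF) << 16) | d0
```

Starting from $12:3456:789A, inc returns 0x123456789B (the INC example) while inc24 returns 0x123457789A (the INC24 example).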


Condition Codes Affected: 


|<---------- MR ---------->|<--------- CCR --------->|
 15 14 13 12 11 10  9  8    7  6  5  4  3  2  1  0

 LF  *  *  * S1 S0 I1 I0    S  L  E  U  N  Z  V  C


S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of the result is in use
U — Set if result is unnormalized
N — Set if bit 39 of the result is set
Z — Set if the 24 most significant bits of the result are all zeroes
V — Set if overflow has occurred in result
C — Set if a carry (or borrow) occurs from bit 39 of the result


Instruction Format: 


INC24 D (parallel move) 
Opcode: 


 15      12  11       8  7        4  3        0
 [opcode bit pattern illegible in source]


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
Jcc Jump Conditionally Jcc


Operation:                          Assembler Syntax:

If cc, then label -> PC             Jcc xxxx
       else PC+1 -> PC

If cc, then Rn -> PC                Jcc (Rn)
       else PC+1 -> PC


Description: If the specified condition is true, program execution continues at the effective address
specified in the instruction. If the specified condition is false, the program counter (PC) is
incremented and program execution continues sequentially. Long displacement (16-bit signed
value) and address register addressing modes may be used.


The term “cc” may specify the following conditions:


 “cc” Mnemonic                               Condition

 CC (HS) — carry clear (higher or same)      C=0
 CS (LO) — carry set (lower)                 C=1
 EC      — extension clear                   E=0
 EQ      — equal                             Z=1
 ES      — extension set                     E=1
 GE      — greater than or equal             N ⊕ V=0
 GT      — greater than                      Z+(N ⊕ V)=0
 LC      — limit clear                       L=0
 LE      — less than or equal                Z+(N ⊕ V)=1
 LS      — limit set                         L=1
 LT      — less than                         N ⊕ V=1
 MI      — minus                             N=1
 NE      — not equal                         Z=0
 NR      — normalized                        Z+(Ū·Ē)=1
 PL      — plus                              N=0
 NN      — not normalized                    Z+(Ū·Ē)=0


where: Ū denotes the logical complement of U,
       + denotes the logical OR operator,
       · denotes the logical AND operator,
       ⊕ denotes the logical Exclusive OR operator


Restrictions: — A Jcc instruction used within a DO loop cannot begin at the address LA within that DO
loop.

— A Jcc instruction cannot be repeated using the REP instruction.
Example:

        JNN     (R2)                ;jump to P:(R2) if not normalized


Explanation of Example: In this example, program execution is transferred to the address P:(R2) if the
result is not normalized. If the specified condition is not true, no jump is taken and the
program counter is incremented by one.
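
The way a condition selects the next program counter value can be sketched as a behavioral model. The Python code below is illustrative only: the function names are hypothetical, the flag equations follow the usual DSP56000-family definitions (in particular Z+(Ū·Ē) for NR and NN is an assumption here), and the fall-through case follows the "else PC+1 -> PC" line of the Operation summary, which fits the one-word Jcc (Rn) form used in the example.

```python
def condition_true(cc, C=0, V=0, Z=0, N=0, U=0, E=0, L=0):
    """Evaluate a condition mnemonic against CCR flag values (each 0 or 1)."""
    n_xor_v = N ^ V
    normalized = Z | ((U ^ 1) & (E ^ 1))    # assumed form: Z + (not U . not E)
    conditions = {
        "CC": C == 0, "HS": C == 0,
        "CS": C == 1, "LO": C == 1,
        "EC": E == 0, "ES": E == 1,
        "EQ": Z == 1, "NE": Z == 0,
        "GE": n_xor_v == 0, "LT": n_xor_v == 1,
        "GT": (Z | n_xor_v) == 0, "LE": (Z | n_xor_v) == 1,
        "LC": L == 0, "LS": L == 1,
        "MI": N == 1, "PL": N == 0,
        "NR": normalized == 1, "NN": normalized == 0,
    }
    return conditions[cc]


def jcc(cc, pc, target, **flags):
    """Return the next PC: the target if cc holds, otherwise PC+1."""
    return target if condition_true(cc, **flags) else pc + 1
```

For the JNN (R2) example, jcc("NN", 0x0100, 0x0200, U=1) returns the target 0x0200 because the result is flagged as unnormalized, whereas with all flags clear the same call returns 0x0101.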


Condition Codes Affected:
The condition codes are not affected by this instruction.


Instruction Format and Opcode:


Jcc xxxx

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  1  1  0 | —  —  1  1 | c  c  c  c
  x  x  x  x | x  x  x  x | x  x  x  x | x  x  x  x

“—” = don’t care


Instruction Fields: xxxx = 16-bit absolute target address


Timing: 4 + jx oscillator clock cycles
Memory: 2 program words


Instruction Format and Opcode:


Jcc Rn

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  1  1  0 | R  R  1  0 | c  c  c  c

 RR   Rn
 00   R0
 01   R1
 10   R2
 11   R3

Timing: 4 + jx oscillator clock cycles
Memory: 1 program word


Instruction Fields:


cc = 4-bit condition code = cccc


 Mnemonic   c c c c     Mnemonic   c c c c

 CC(HS)     0 0 0 0     CS(LO)     1 0 0 0
 GE         0 0 0 1     LT         1 0 0 1
 NE         0 0 1 0     EQ         1 0 1 0
 PL         0 0 1 1     MI         1 0 1 1
 NN         0 1 0 0     NR         1 1 0 0
 EC         0 1 0 1     ES         1 1 0 1
 LC         0 1 1 0     LS         1 1 1 0
 GT         0 1 1 1     LE         1 1 1 1
JMP Jump JMP


Operation:                          Assembler Syntax:
label -> PC                         JMP xxxx
Rn -> PC                            JMP (Rn)


Description: Jump to the location in program memory at the location given by the instruction’s effective 
address. Long displacement (16-bit signed value) and address register addressing modes 
may be used. 


Restrictions: — A JMP instruction used within a DO loop cannot begin at address LA within that DO 
loop. 


— A JMP instruction cannot be repeated using the REP instruction. 
Example:

        JMP     (R2)                ;jump to P:(R2)

Explanation of Example: In this example, program execution is transferred to the address P:(R2).


Condition Codes Affected: 


The condition codes are not affected by this instruction. 
Instruction Format and Opcode:


JMP xxxx

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  0  0  1 | 0  0  1  1 | 0  1  —  —
  x  x  x  x | x  x  x  x | x  x  x  x | x  x  x  x

“—” = don’t care


Instruction Fields:


xxxx = 16-bit signed absolute branch address


Timing: 4 + jx oscillator clock cycles
Memory: 2 program words


Instruction Format and Opcode:


JMP Rn

 15      12  11       8  7        4  3        0
  0  0  0  0 | 0  0  0  1 | 0  0  1  0 | 0  1  R  R

 RR   Rn
 00   R0
 01   R1
 10   R2
 11   R3

Timing: 4 + jx oscillator clock cycles
Memory: 1 program word
JScc Jump to Subroutine Conditionally JScc


Operation:                          Assembler Syntax:

If cc, then SP+1 -> SP              JScc xxxx
            PC -> SSH
            SR -> SSL
            xxxx -> PC
       else PC+1 -> PC


If cc, then SP+1 -> SP              JScc Rn
            PC -> SSH
            SR -> SSL
            Rn -> PC
       else PC+1 -> PC


Description: If the specified condition is true, program execution continues at the location in program
memory given by the instruction’s effective address. If the specified condition is false, the
program counter (PC) is incremented and program execution continues sequentially. Long
displacement (16-bit signed value) and address register addressing modes may be used.


The term “cc” may specify the following conditions:


 “cc” Mnemonic                               Condition

 CC (HS) — carry clear (higher or same)      C=0
 CS (LO) — carry set (lower)                 C=1
 EC      — extension clear                   E=0
 EQ      — equal                             Z=1
 ES      — extension set                     E=1
 GE      — greater than or equal             N ⊕ V=0
 GT      — greater than                      Z+(N ⊕ V)=0
 LC      — limit clear                       L=0
 LE      — less than or equal                Z+(N ⊕ V)=1
 LS      — limit set                         L=1
 LT      — less than                         N ⊕ V=1
 MI      — minus                             N=1
 NE      — not equal                         Z=0
 NR      — normalized                        Z+(Ū·Ē)=1
 PL      — plus                              N=0
 NN      — not normalized                    Z+(Ū·Ē)=0


where: Ū denotes the logical complement of U,
       + denotes the logical OR operator,
       · denotes the logical AND operator,
       ⊕ denotes the logical Exclusive OR operator


Restrictions: — A JScc instruction used within a DO loop cannot begin at address LA within that DO
loop.

— A JScc instruction used within a DO loop cannot specify the loop address LA as its
target.

— A JScc instruction cannot be repeated using the REP instruction.
Example:

        JSLS    R2                  ;jump to subroutine at P:(R2) if limit set


Explanation of Example: In this example, program execution is transferred to the subroutine at address
P:(R2) if the limit bit is set. If the specified condition is not true, no jump is taken and the
program counter is incremented by one.

Condition Codes Affected: The condition codes are not affected by this instruction. 

Instruction Format and Opcode: 


JScc xxxx
15       12 11        8 7         4 3         0
 0  0  0  0| 0  1  1  0| -  -  0  1| c  c  c  c
 x  x  x  x| x  x  x  x| x  x  x  x| x  x  x  x
“-” = don’t care


Instruction Fields: 


xxxx = 16-bit absolute branch address 


Timing: 4 + jx oscillator clock cycles 
Memory: 2 program words 
Instruction Format and Opcode: 
JScc Rn
15       12 11        8 7         4 3         0          RR  Rn
 0  0  0  0| 0  1  1  0| R  R  0  0| c  c  c  c          00  R0
                                                         01  R1
                                                         10  R2
                                                         11  R3
Timing: 4 + jx oscillator clock cycles 
Memory: 1 program word 
Instruction Fields:
cc = 4-bit condition code = cccc

Mnemonic  c c c c      Mnemonic  c c c c
CC(HS)    0 0 0 0      CS(LO)    1 0 0 0
GE        0 0 0 1      LT        1 0 0 1
NE        0 0 1 0      EQ        1 0 1 0
PL        0 0 1 1      MI        1 0 1 1
NN        0 1 0 0      NR        1 1 0 0
EC        0 1 0 1      ES        1 1 0 1
LC        0 1 1 0      LS        1 1 1 0
GT        0 1 1 1      LE        1 1 1 1
JSR                          Jump to Subroutine                          JSR

Operation:                              Assembler Syntax:
SP+1 → SP                               JSR xxxx
PC → SSH
SR → SSL
xxxx → PC

SP+1 → SP                               JSR AA
PC → SSH
SR → SSL
AA → PC

SP+1 → SP                               JSR Rn
PC → SSH
SR → SSL
Rn → PC


Description: Jump to subroutine in program memory at the location given by the instruction’s effective 
address. Short displacement (8-bit unsigned value), long displacement (16-bit absolute 
address) and address register addressing modes may be used. 


Restrictions: - A JSR instruction used within a DO loop cannot begin at address LA within that DO loop.
- A JSR instruction used within a DO loop cannot specify the loop address LA as its target.
- A JSR instruction cannot be repeated using the REP instruction.

Example:
JSR R2 ;jump to absolute address pointed to by R2


Explanation of Example: In this example, program execution is transferred to the subroutine at address 
P:(R2).


Condition Codes Affected: 


The condition codes are not affected by this instruction. 


Instruction Format and Opcode: 


JSR xxxx
15       12 11        8 7         4 3         0
 0  0  0  0| 0  0  0  1| 0  0  1  1| 0  0  -  -
 x  x  x  x| x  x  x  x| x  x  x  x| x  x  x  x
“-” = don’t care


Instruction Fields: xxxx = 16-bit signed absolute branch address 


Timing: 4 + jx oscillator clock cycles 
Memory: 2 program words 


Instruction Format and Opcode: 


JSR AA
15       12 11        8 7         4 3         0
 0  0  0  0| 1  0  1  0| A  A  A  A| A  A  A  A


Instruction Fields: AA...A = 8-bit unsigned absolute short branch address 


Timing: 4 + jx oscillator clock cycles 
Memory: 1 program word 


Instruction Format and Opcode: 


JSR Rn
15       12 11        8 7         4 3         0          RR  Rn
 0  0  0  0| 0  0  0  1| 0  0  1  0| 0  0  R  R          00  R0
                                                         01  R1
                                                         10  R2
                                                         11  R3
Timing: 4 + jx oscillator clock cycles 
Memory: 1 program word 
LEA                        Load Effective Address                        LEA

Operation:                              Assembler Syntax:
ea → D (no parallel move)               LEA ea,D


Description: The address calculation specified is executed and the resulting effective address is stored 
in the destination register. The source address register and the update mode used to com- 
pute the updated address are specified by the effective address (ea). Note that the source 
address register specified in the effective address is not updated. All update addressing 
modes may be used. 


Note: This instruction is considered to be a move-type instruction. Due to pipelining, the new contents of 
the destination address register (R0-R3 or N0-N3) will not be available for use during the following 
instruction (i.e., there is a single instruction cycle pipeline delay). 


Example: 
LEA (R0)+N0,R1 ;update R1 using (R0)+N0 


Before Execution            After Execution 
R0 0003                     R0 0003 
N0 0005                     N0 0005 
R1 0004                     R1 0008 


Explanation of Example: Prior to execution, the 16-bit address register R0 contains the value $0003, the 
16-bit address register N0 contains the value $0005 and the 16-bit address register R1 
contains the value $0004. Execution of the LEA (R0)+N0,R1 instruction adds the contents of 
the R0 register to the contents of the N0 register and stores the resulting updated address 
in the R1 address register. The contents of both the R0 and N0 address registers are not 
affected. 
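
The address calculation can be sketched as a small behavioral model. This is a minimal illustration under stated assumptions: the function name `lea` is hypothetical, only linear (non-modulo) 16-bit arithmetic is modeled, and the update modes are named by their assembler spelling.

```python
def lea(mode, r, n=0):
    """Return the effective address LEA would store in the destination.

    r and n are the source address register and offset register values;
    neither is modified, mirroring the note that the source register is
    not updated. Assumption: linear arithmetic with 16-bit wraparound.
    """
    if mode == "(Rn)+":
        ea = r + 1
    elif mode == "(Rn)-":
        ea = r - 1
    elif mode == "(Rn)+Nn":
        ea = r + n
    else:
        raise ValueError("unsupported update mode: " + mode)
    return ea & 0xFFFF


# Values from the example: R0 = $0003, N0 = $0005
r1 = lea("(Rn)+Nn", 0x0003, 0x0005)
```

Here `r1` holds $0008, matching the After Execution column, while the caller's R0 and N0 values are unchanged.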


Condition Codes Affected: 


The condition codes are not affected by this instruction. 


Instruction Format: 
LEA ea,Rn 
Opcode: 
15       12 11        8 7         4 3         0          TT  Destination
 0  0  0  0| 0  0  0  1| 1  1  T  T| M  M  R  R          00  R0
                                                         01  R1
                                                         10  R2
                                                         11  R3
Instruction Format: 
LEA ea,Nn 
Opcode: 
15       12 11        8 7         4 3         0          NN  Destination
 0  0  0  0| 0  0  0  1| 1  0  N  N| M  M  R  R          00  N0
                                                         01  N1
                                                         10  N2
                                                         11  N3
Instruction Fields: 
MMRR  Effective Address      RR  Source
00RR  Rn                     00  R0
01RR  (Rn)+                  01  R1
10RR  (Rn)-                  10  R2
11RR  (Rn)+Nn                11  R3
Timing: 4 oscillator clock cycles 
Memory: 1 program word 
LSL                          Logical Shift Left                          LSL


Assembler Syntax: 
LSL D (parallel move) 

Operation: 
C ← D1 (bits 31-16) ← 0     (parallel move) 
D2 and D0 are unchanged. 


Description: Logically shift bits 31-16 (D1) of the destination operand D one bit to the left and store the 
result in the destination accumulator upper word D1. The MS bit of D1 (bit 31 of D) prior to 
instruction execution is shifted into the carry bit C and a zero is shifted into the LS bit of 
D1 (bit 16 of D). 


Example: 
LSL A (R3)- ;multiply A1 by 2, update R3 

Before Execution            After Execution 
A2 A1   A0                  A2 A1   A0 
A5 8123 0123                A5 0246 0123 
SR=MR:CCR 0000              SR=MR:CCR 0001 


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $A5:8123:0123. 
Execution of the LSL A instruction shifts the 16-bit value in the A1 accumulator one bit to 
the left and leaves A2 and A0 unchanged. The C bit of CCR (bit 0) is set by the operation 
because bit 31 of A was set prior to the instruction execution. 
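
The shift can be sketched as a behavioral model operating on the 40-bit accumulator value. This is an illustration only: the function name `lsl` is hypothetical, and the parallel move and the remaining condition codes are not modeled.

```python
MASK40 = 0xFFFFFFFFFF


def lsl(acc):
    """Shift D1 (bits 31-16) left by one; return (new accumulator, carry).

    D2 (bits 39-32) and D0 (bits 15-0) are passed through unchanged.
    """
    d1 = (acc >> 16) & 0xFFFF
    carry = d1 >> 15                 # old bit 31 of D goes to C
    new_d1 = (d1 << 1) & 0xFFFF      # zero enters bit 16 of D
    result = (acc & ~(0xFFFF << 16)) | (new_d1 << 16)
    return result & MASK40, carry


# Value from the example: A = $A5:8123:0123
after, c = lsl(0xA581230123)
```

For the example input, `after` is $A5:0246:0123 and `c` is 1, in line with the After Execution column.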
Condition Codes Affected: 

 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4) 
L - Set if limiting (parallel move) or overflow has occurred in result 
N - Set if bit 31 of A or B result is set 
Z - Set if A1 or B1 result equals zero 
V - Always cleared 
C - Set if bit 31 of A or B was set prior to instruction execution 
Instruction Format: 

LSL D (parallel move) 
Opcode: 
15 12 11 8 7 4 3 0 


Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


Instruction Fields: 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
LSR                         Logical Shift Right                         LSR


Assembler Syntax: 
LSR D (parallel move) 

Operation: 
0 → D1 (bits 31-16) → C     (parallel move) 
D2 and D0 are unchanged. 


Description: Logically shift bits 31-16 (D1) of the destination operand D one bit to the right and store the 
result in the destination accumulator upper word D1. The LS bit of D1 (bit 16 of D) prior to 
instruction execution is shifted into the carry bit C and zero is shifted into the MS bit of D1 
(bit 31 of D). 

Example: 
LSR B X:-(R3),R3 ;divide B1 by 2, update R3, load R3 

Before Execution            After Execution 
B2 B1   B0                  B2 B1   B0 
A8 0001 A865                A8 0000 A865 
SR=MR:CCR 0300              SR=MR:CCR 0305 


Explanation of Example: Prior to execution, the 40-bit B accumulator contains the value 
$A8:0001:A865. Execution of the LSR B instruction shifts the 16-bit value in the B1 register 
one bit to the right and stores the result back in the B1 register. The C bit of CCR (bit 0) is 
set by the operation because bit 0 of B1 was set prior to the instruction execution. The Z bit 
of CCR (bit 2) is also set because the result in B1 is zero. 
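
A behavioral sketch of the right shift, including the C and Z results discussed above, might look as follows. The function name `lsr` and the returned flag dictionary are hypothetical; the parallel move and the S and L bits are not modeled.

```python
MASK40 = 0xFFFFFFFFFF


def lsr(acc):
    """Shift D1 (bits 31-16) right by one; return (new accumulator, flags).

    flags holds the C, Z, N and V results described for LSR: C is the old
    bit 16 of D, Z reflects the new D1, N and V are always cleared.
    """
    d1 = (acc >> 16) & 0xFFFF
    carry = d1 & 1                   # old bit 16 of D goes to C
    new_d1 = d1 >> 1                 # zero enters bit 31 of D
    result = (acc & ~(0xFFFF << 16)) | (new_d1 << 16)
    flags = {"C": carry, "Z": int(new_d1 == 0), "N": 0, "V": 0}
    return result & MASK40, flags


# Value from the example: B = $A8:0001:A865
after, flags = lsr(0xA80001A865)
```

For the example input the model yields $A8:0000:A865 with C = 1 and Z = 1, which corresponds to the CCR bits 0 and 2 that change SR from $0300 to $0305.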
Condition Codes Affected: 

 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4) 
L - Set if data limiting has occurred during parallel move 
N - Always cleared 
Z - Set if A1 or B1 result equals zero 
V - Always cleared 
C - Set if bit 16 of A or B was set prior to instruction execution 
Instruction Format: 
LSR D (parallel move) 
Opcode: 

15 12 11 8 7 4 3 0 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
MAC                          Multiply-Accumulate                          MAC

Operation:                                    Assembler Syntax: 
D ± S1 * S2 → D (one parallel move)           MAC (±)S2,S1,D (one parallel move) 
D + S1 * S2 → D (two parallel reads)          MAC S1,S2,D (two parallel reads) 
D + S1 * S2 → D; D → X:(Rn)+Nn; S → D         MAC S1,S2,D D,X:(Rn)+Nn S,D 


Description: Multiply the two signed 16-bit source operands S1 and S2 and add/subtract the product to/from 
the specified 40-bit destination accumulator D. The “-” sign option is used to negate 
the specified product prior to accumulation. This option is not available when two parallel 
read operations are performed. The instruction that accesses D is particularly useful for 
implementing the Least Mean Square (LMS) adaptive filter algorithm (see Appendix B). 

Example: 

MAC X1,Y1,A X:(R2)+,Y1 X:(R3)+,X1 
Before Execution            After Execution 
A2 A1   A0                  A2 A1   A0 
00 1000 0000                00 0A2B 0000 
4000 SFFF 
x1 x1 
F456 F454 
Y1 Y1 


Explanation of Example: Prior to execution, the 16-bit X1 register contains the value $4000, the 16-bit 
Y1 register contains the value $F456 and the 40-bit A accumulator contains the value 
$00:1000:0000. Execution of the MAC X1,Y1,A instruction multiplies the 16-bit signed 
value in the X1 register by the 16-bit signed value in Y1, adds the resulting 32-bit product 
to the 40-bit A accumulator, and stores the result ($00:0A2B:0000) in accumulator A. 
In parallel, X1 and Y1 are updated with new values fetched from the data memory and the 
two address registers R2 and R3 are postincremented by one. 
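
The arithmetic in this example can be checked with a short behavioral model. This is a sketch under stated assumptions: the names `to_signed` and `mac` are hypothetical, fractional arithmetic is modeled as a one-bit left shift of the signed product, no scaling mode is selected, and the parallel moves and condition codes are not modeled.

```python
MASK40 = 0xFFFFFFFFFF


def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, s1, s2, negate=False):
    """Return the 40-bit accumulator after D +/- S1*S2 -> D.

    Assumption: the signed 16x16 product is shifted left by one bit so the
    32-bit fractional result aligns with D1:D0.
    """
    product = (to_signed(s1, 16) * to_signed(s2, 16)) << 1
    if negate:
        product = -product
    return (to_signed(acc, 40) + product) & MASK40


# Values from the example: A = $00:1000:0000, X1 = $4000, Y1 = $F456
result = mac(0x0010000000, 0x4000, 0xF456)
```

With these inputs the product is -$05D5:0000, so `result` is $00:0A2B:0000, the value shown in the After Execution column.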


Condition Codes Affected: 

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4) 
L - Set if limiting (parallel move) or overflow has occurred in result 
E - Set if the signed integer portion of A or B result is in use 
U - Set according to the standard definition of the U bit 
N - Set if bit 39 of A or B result is set 
Z - Set if A or B result equals zero 
V - Set if overflow has occurred in A or B result 

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 
Instruction Format: MAC (±)S2,S1,D (one parallel move) 
Opcode: 
15       12 11        8 7         4 3         0          Sign  k
 1  m  R  R| H  H  H  W| 1  k  1  0| F  Q  Q  Q          +     0
                                                         -     1

Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 
Instruction Format: MAC S1,S2,D (two parallel reads) 
Opcode: 
15       12 11        8 7         4 3         0
 0  1  1  m| m  K  K  K| 1  x  x  0| F  1  Q  Q


Instruction Fields: Please see the “Dual X Memory Data Read” description in the parallel move 
section for details on the mm and KKK data fields. 


Instruction Format: MAC S1,S2,D D,X:(Rn)+Nn S,D (one memory write, one data register move) 
Opcode: 
15       12 11        8 7         4 3         0
 0  0  0  1| 0  1  1  1| R  R  D  D| F  Q  Q  Q


Instruction Fields: Please see the “X Memory Data Write and Register Data Move” description in the 
parallel move section for details on the RR and DD data fields. 

One or Two Parallel Operations 
S1,S2,D   QQQ F      S1,S2,D   QQQ F
X0,X0,A   000 0      Y0,X0,A   100 0
X0,X0,B   000 1      Y0,X0,B   100 1
X1,X0,A   001 0      Y1,X0,A   101 0
X1,X0,B   001 1      Y1,X0,B   101 1
A1,Y0,A   010 0      Y0,X1,A   110 0
A1,Y0,B   010 1      Y0,X1,B   110 1
B1,X0,A   011 0      Y1,X1,A   111 0
B1,X0,B   011 1      Y1,X1,B   111 1

Two Parallel Reads 
S1,S2,D   QQ F       S1,S2,D   QQ F
X0,Y0,A   00 0       X1,Y0,A   10 0
X0,Y0,B   00 1       X1,Y0,B   10 1
X0,Y1,A   01 0       X1,Y1,A   11 0
X0,Y1,B   01 1       X1,Y1,B   11 1

Timing: 2+ mv oscillator clock cycles 

Memory: 1 program word 
MACR                  Multiply-Accumulate and Round                  MACR


Operation:                                    Assembler Syntax: 
D ± S1 * S2 + r → D (one parallel move)       MACR (±)S2,S1,D (one parallel operation) 
D + S1 * S2 + r → D (two parallel reads)      MACR S1,S2,D (two parallel reads) 


Description: Multiply the two signed 16-bit source operands S1 and S2, add/subtract the product to/from 
the specified 40-bit destination accumulator D, and round the result using the specified 
rounding. The rounded result is stored in the destination accumulator. Refer to the round 
instruction for more complete information on the convergent rounding process. The “-” sign 
option is used to negate the specified product prior to accumulation. This option is not avail- 
able when two parallel reads are performed. The default sign option is “+”. 


Example: 
MACR -X0,Y1,A A0,X0 
Before Execution            After Execution 
A2 A1   A0                  A2 A1   A0 
00 1000 1234                00 15D5 0000 
X0 4000                     X0 1234 
Y1 F456                     Y1 F454 


Explanation of Example: Prior to execution, the 16-bit X0 register contains the value $4000 (0.5), the 
16-bit Y1 register contains the value $F456 (-0.0911255) and the 40-bit A accumulator 
contains the value $00:1000:1234 (0.125002169981599). Execution of the MACR -X0,Y1,A 
instruction multiplies the 16-bit signed value in the X0 register by the 16-bit signed value in 
Y1, subtracts the resulting 32-bit product from the 40-bit A accumulator, rounds the result, 
and stores it ($00:15D5:0000) in accumulator A (-X0 * Y1 + A = 
0.170562744140625). In parallel, A0 is saved into X0 before the result is stored in A. In this 
example, the default rounding (convergent rounding) is performed. 
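
The accumulate-then-round sequence can be sketched as follows. This is a behavioral illustration under stated assumptions: the names `to_signed`, `convergent_round` and `macr` are hypothetical, no scaling mode is selected, and convergent rounding is modeled as round-to-nearest with ties going to the even value of D1.

```python
MASK40 = 0xFFFFFFFFFF


def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def convergent_round(acc):
    """Round a 40-bit accumulator to D2:D1 and clear D0 (ties to even)."""
    low = acc & 0xFFFF
    acc += 0x8000
    if low == 0x8000:            # exactly halfway: force bit 16 to zero
        acc &= ~0x10000
    return acc & ~0xFFFF & MASK40


def macr(acc, s1, s2, negate=False):
    """Return the accumulator after D +/- S1*S2 + r -> D (fractional)."""
    product = (to_signed(s1, 16) * to_signed(s2, 16)) << 1
    if negate:
        product = -product
    return convergent_round((to_signed(acc, 40) + product) & MASK40)


# Values from the example: A = $00:1000:1234, X0 = $4000, Y1 = $F456
result = macr(0x0010001234, 0x4000, 0xF456, negate=True)
```

The unrounded sum is $00:15D5:1234; because A0 ($1234) is below the halfway point, rounding leaves $00:15D5:0000, the value quoted in the explanation.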
Condition Codes Affected: 

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4) 
L - Set if limiting (parallel move) or overflow has occurred in result 
E - Set if the signed integer portion of A or B result is in use 
U - Set according to the standard definition of the U bit 
N - Set if bit 39 of A or B result is set 
Z - Set if A or B result equals zero 
V - Set if overflow has occurred in A or B result 

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format: MACR (+)S1,S2,D (one parallel operation) 
Opcode: 
15 12 11 8 7 4 3 0 
Sign | k 
one parallel operation 1 k 11/F QQ Q + 10 
- 44 
Instruction Format: MACR $1,S2,D (two parallel reads) 
Opcode: 
15 12 11 8 7 4 3 0 


two parallel reads 1—— 1/F 1QQ 
“—” = don’t care 


Instruction Fields: 

One Parallel Operation 
S1,S2,D   QQQ F      S1,S2,D   QQQ F
X0,X0,A   000 0      Y0,X0,A   100 0
X0,X0,B   000 1      Y0,X0,B   100 1
X1,X0,A   001 0      Y1,X0,A   101 0
X1,X0,B   001 1      Y1,X0,B   101 1
A1,Y0,A   010 0      Y0,X1,A   110 0
A1,Y0,B   010 1      Y0,X1,B   110 1
B1,X0,A   011 0      Y1,X1,A   111 0
B1,X0,B   011 1      Y1,X1,B   111 1

Two Parallel Reads 
S1,S2,D   QQ F       S1,S2,D   QQ F
X0,Y0,A   00 0       X1,Y0,A   10 0
X0,Y0,B   00 1       X1,Y0,B   10 1
X0,Y1,A   01 0       X1,Y1,A   11 0
X0,Y1,B   01 1       X1,Y1,B   11 1

Timing: 2 + mv oscillator clock cycles 

Memory: 1 program word 
MAC(su,uu)                Mixed Multiply-Accumulate                MAC(su,uu)


Operation:                                     Assembler Syntax: 
D + S1 * S2 → D (S1 unsigned, S2 unsigned)     MACuu S1,S2,D (no parallel move) 
D + S1 * S2 → D (S1 signed, S2 unsigned)       MACsu S1,S2,D (no parallel move) 


Description: Multiply the two 16-bit source operands S1 and S2 and add the product to the specified 40- 
bit destination accumulator D. One or two of the source operands can be unsigned. This 
mixed arithmetic multiply-accumulate does not allow a parallel move and can be used for 
multiple precision multiplications. 


Example: 
MACuu X1,Y1,A 
MACsu X1,Y1,A 

Before MACuu Execution      After MACuu Execution 
A2 A1   A0                  A2 A1   A0 
00 1000 0000                00 10C3 FFC3 

Before MACsu Execution      After MACsu Execution 
A2 A1   A0                  A2 A1   A0 
00 10C3 FFC3                C4 10C3 FEFF 


Explanation of Example: The 16-bit X1 register contains the value $FFFF and the 16-bit Y1 register 
contains the value $0062. 


Execution of the MACuu X1,Y1,A instruction multiplies the 16-bit unsigned value in the X1 
register by the 16-bit unsigned value in Y1, then adds the result to the accumulator A and 
stores the unsigned result back into the accumulator A. 


Execution of the MACsu X1,Y1,A instruction multiplies the 16-bit signed value in the X1 
register by the 16-bit unsigned value in Y1, then adds the result to the accumulator A and 
stores the signed result back into the accumulator A. 


Warning: The saturation mode is always disabled during execution of MAC(su,uu), even when the 
saturation bit (SA) of the OMR is set. Refer to Section 5.8.3 for more details. 
Condition Codes Affected: 

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

E - Set if the signed integer portion of A or B result is in use 
U - Set according to the standard definition of the U bit 
N - Set if bit 39 of A or B result is set 
Z - Set if A or B result equals zero 
V - Set if overflow has occurred in A or B result 

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 
Instruction Format: 
MACuu S1,S2,D 
MACsu S1,S2,D 

Opcode: 
15       12 11        8 7         4 3         0
 0  0  0  1| 0  1  0  1| 1  1  1  0| F  s  Q  Q

Instruction Fields: 
Arithmetic  s        S1,S2,D   QQ F       S1,S2,D   QQ F
su          0        Y0,X0,A   00 0       X1,Y0,A   10 0
uu          1        Y0,X0,B   00 1       X1,Y0,B   10 1
                     Y1,X0,A   01 0       X1,Y1,A   11 0
                     Y1,X0,B   01 1       X1,Y1,B   11 1

Note: For MACsu, the order of S1, S2 is significant; the signed value will be taken from 
S1 while the unsigned value will be taken from S2. 
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
MOVE                              Move Data                              MOVE


Operation:                               Assembler Syntax: 
one move                                 MOVE (one parallel operation) 
two memory reads                         MOVE (double memory read) 
one parallel memory move plus            MOVE (memory access, register move) 
one data register move 
#xxxx → D (see Move(C) instruction)      MOVE #xxxx,D 


Description: This instruction is equivalent to a Data ALU NOP with a parallel data move as described in 
Section A.4 entitled “Parallel Move Descriptions”. Refer to that section for more 
information. 


When a 40-bit accumulator (A or B) is specified as a source operand S, the accumulator 
value is optionally shifted according to the scaling mode bits S0 and S1 in the system status 
register (SR). If the data out of the shifter indicates that the accumulator extension register 
is in use and the data is to be moved into a 16-bit destination, the value stored in the 
destination D is limited to a maximum positive or negative saturation constant to minimize 
truncation error. Limiting does not occur if an individual 16-bit accumulator register (A1, A0, B1, 
or B0) is specified as a source operand instead of the full 40-bit accumulator (A or B). This 
limiting feature allows block floating point operations to be performed with error detection 
since the L bit in the condition code register is latched (i.e., sticky). 


When a 40-bit accumulator (A or B) is specified as a destination operand D, any 16-bit 
source data to be moved into that accumulator is automatically extended to 40 bits by 
sign-extending the MS bit of the source operand (bit 15) and appending the source operand with 
16 LS zeros. Note that the automatic sign-extension and zeroing features may be 
circumvented by specifying the destination register to be one of the individual 16-bit accumulator 
registers (A1 or B1). 


Example: 
MOVE X0,A1 ;move X0 to A1 without sign extension or zeroing 

Before Execution            After Execution 
A2 A1   A0                  A2 A1   A0 
FF FFFF FFFF                FF 1234 FFFF 
X0 1234                     X0 1234 


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value 
$FF:FFFF:FFFF and the 16-bit X0 register contains the value $1234. Execution of the 
MOVE X0,A1 instruction moves the 16-bit value in the X0 register into the 16-bit A1 register 
without automatic sign extension and without automatic zeroing. 
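
The difference between writing the full accumulator and writing only A1 can be sketched as a small model. The function names `move_to_acc` and `move_to_a1` are hypothetical, and the sketch covers only the register-write behavior described above, not the shifter or limiter.

```python
MASK40 = 0xFFFFFFFFFF


def move_to_acc(value16):
    """Model a 16-bit move into a full accumulator (A or B).

    Bit 15 of the source is sign-extended into the extension byte and
    the low word is zeroed.
    """
    value16 &= 0xFFFF
    ext = 0xFF if value16 & 0x8000 else 0x00
    return (ext << 32) | (value16 << 16)


def move_to_a1(acc, value16):
    """Model a 16-bit move into A1 only: A2 and A0 keep their contents."""
    return ((acc & ~(0xFFFF << 16)) | ((value16 & 0xFFFF) << 16)) & MASK40


# Values from the example: A = $FF:FFFF:FFFF, X0 = $1234
a_after = move_to_a1(0xFFFFFFFFFF, 0x1234)
```

For the example, `a_after` is $FF:1234:FFFF. By contrast, `move_to_acc(0x1234)` would give $00:1234:0000, which is the automatic sign extension and zeroing that the A1 destination avoids.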
Condition Codes Affected: 

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Set according to the standard definition of the S bit. 
L - Set if data limiting has occurred during parallel move 


Instruction Format and Opcode: 
MOVE (one parallel move) 
15       12 11        8 7         4 3         0
 1  m  R  R| H  H  H  W| 0  0  0  1| 0  0  0  1
Instruction Format and Opcode: 
3 0 


MOVE (double memory read) 
15 12 11 8 7 4 


v1 imme K Kore 1000. 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. See the “Dual X Memory Read” de- 
scription in the parallel move section for details on the mm, KKK, and rr data fields. 


Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 


Instruction Format and Opcode: 
MOVE X:(R2+xx),D for W=0 -or- MOVE S,X:(R2+xx) for W=1 
15       12 11        8 7         4 3         0
 0  0  0  0| 0  1  0  1| B  B  B  B| B  B  B  B
 -  -  -  -| H  H  H  W| 0  0  0  1| 0  0  0  1
“-” = don’t care 
Instruction Fields: Please see the “X Memory Data Move’ description in the parallel move section for 
details on the HHH and W data fields. 


Timing: 2 + mv oscillator clock cycles 
Memory: 2 program words 
Parallel                  Parallel Move Descriptions                  Parallel 
Move                                                                  Move 


Thirty-two Data ALU instructions provide the capability of specifying an optional parallel operation. This 
parallel operation can be a data bus movement over the X Data Bus with optional address register update, an 
address register update without data bus movement or a Data ALU register transfer. 


Eight major Data ALU instructions provide the capability of dual X memory read with address register 
update. These Data ALU instructions have been selected for optimal performance on frequently used DSP 
algorithm critical loops. 


Two Data ALU instructions, MPY and MAC, provide the capability of one parallel X memory read plus one 
Data ALU register transfer. These two instructions allow for very high performance adaptive transversal 
filtering. 


Seven types of parallel moves are permitted, including register to register moves, register to memory moves 
and memory to register moves. However, not all addressing modes are allowed for each type of memory 
reference. Addressing mode restrictions which apply to specific types of moves are noted in the individual 
move operation descriptions. The following section contains detailed descriptions about each type of 
parallel move operation. 


The symbols used in decoding the various opcode fields of an instruction or parallel move are completely 
arbitrary. Furthermore, the opcode symbols used in one instruction or parallel move are completely 
independent of the opcode symbols used in a different instruction or parallel move. 
Parallel                    No Parallel Data Move                    Parallel 
Move                                                                 Move 
Operation:                              Assembler Syntax: 
(...)                                   (...) 


where (...) refers to any arithmetic or logical instruction. 
Description: All Data ALU operations can be performed without any parallel move. 


Example: 
ADD X0,A ;add X0 to A (no parallel move) 


Explanation of Example: This is an example of an instruction which allows parallel moves but doesn’t 
have one. 


Condition Codes Affected: 
The condition codes are not affected by this type of parallel move. 


Instruction Format: 


(...) 
Opcode: 


15 12 11 8 7 4 3 0 


0 1 0 0}1 0 1 0 Data ALU Opcode 


Instruction Fields: (defined by Data ALU instruction) 


Timing: mv oscillator clock cycles 
Memory: mv program words 
Parallel                Register to Register Data Move                Parallel 
Move                                                                  Move 
Operation:                              Assembler Syntax: 
(...); S → D                            (...) S,D 


where (...) refers to any arithmetic or logical instruction which allows parallel moves. 


Description: Move the source register S to the destination register D. 

If the arithmetic or logical opcode-operand portion of the instruction specifies a given destination 
accumulator, that same accumulator or portion of that accumulator may not be specified as a destination D in the 
parallel data bus move operation. Thus, if the opcode-operand portion of the instruction specifies the 40-bit 
A accumulator as its destination, the parallel data bus move portion of the instruction may not specify A0, 
A1, A2, or A as its destination D. Similarly, if the opcode-operand portion of the instruction specifies the 
40-bit B accumulator as its destination, the parallel data bus move portion of the instruction may not specify B0, 
B1, B2, or B as its destination D. That is, duplicate destinations are not allowed within the same instruction. 


If the opcode-operand portion of the instruction specifies a given source or destination register, that same 
register or portion of that register may be used as a source S in the parallel data bus move operation. This 
allows data to be moved in the same instruction in which it is being used as a source operand by a Data 
ALU operation. That is, duplicate sources are allowed within the same instruction. 


Note: The MOVE A,B operation will result in a 16-bit positive or negative saturation constant being stored 
in the B1 portion of the B accumulator if the signed integer portion of the A accumulator is in use. 
The opposite is true for the MOVE B,A instruction. 


Example: 
MACR -X0,Y0,B A,X1 


Before Execution            After Execution 
A2 A1   A0                  A2 A1   A0 
01 0008 789A                01 0008 789A 
X1 0003                     X1 7FFF 


Explanation of Example: Prior to execution, the 16-bit X1 register contains the value $0003 and the 
40-bit accumulator A contains the value $01:0008:789A. Execution of the parallel move portion 
of the instruction, A,X1, moves the contents of the A accumulator into X1. Limiting is performed by the 
shifter/limiter because the data stored in A before instruction execution is using the integer 
portion of A, so the positive saturation constant $7FFF is stored in X1. The example assumes 
no scaling is selected in the MR register. 
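
The limiting behavior in this example can be sketched as a small model. This is an illustration under stated assumptions: the names `to_signed` and `limit_move` are hypothetical, no scaling mode is selected, and the extension is treated as "in use" whenever bits 39 through 31 of the accumulator are not all equal.

```python
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def limit_move(acc):
    """Return (16-bit value, limited flag) for moving a full accumulator
    into a 16-bit destination with no scaling.

    If the value fits in 32 bits, the upper word (bits 31-16) is moved
    as is; otherwise a saturation constant ($7FFF or $8000) is used.
    """
    signed = to_signed(acc, 40)
    upper = signed >> 31          # 0 or -1 when bits 39..31 are all equal
    if upper in (0, -1):
        return (acc >> 16) & 0xFFFF, False
    return (0x7FFF if signed >= 0 else 0x8000), True


# Value from the example: A = $01:0008:789A
x1, limited = limit_move(0x010008789A)
```

For the example accumulator the model returns $7FFF with the limited flag set, corresponding to the X1 value in the After Execution column and to the L bit being set.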
Condition Codes: 

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0

S - Set according to the standard definition of the S bit. 
L - Set if data limiting has occurred during parallel move 


Instruction Format: 
(...) S,D 
Opcode: 
15       12 11        8 7                     0
 0  1  0  0| I  I  I  I|   Data ALU Opcode


Instruction Fields: 


S,D      IIII 
X0,F     0000 
Y0,F     0001 
X1,F     0010 
Y1,F     0011 
A,X0     0100 
B,Y0     0101 
A0,X0    0110 
B0,Y0    0111 
EF. 1000 
FF 1001 
A,X1 1100 
B,Y1 1101 
A0,X1    1110 
B0,Y1    1111 


F is the accumulator which is not used by the 
parallel Data ALU operation. 
(in the case of no Data ALU operation, A is chosen) 


Timing: mv oscillator clock cycles 
Memory: mv program words 
Parallel                   Address Register Update                   Parallel 
Move                                                                 Move 
Operation:                              Assembler Syntax: 
(...); ea → Rn                          (...) ea 


where (...) refers to any arithmetic or logical instruction which allows such parallel operations. 


Description: Update the specified address register according to the specified effective addressing 
mode. Two update addressing modes may be used (postdecrement by one; postincrement 
by the offset register). 


Example: 
RND B (R3)+N3 ;round value in B into B1, R3+N3 → R3 

Before Execution            After Execution 
R3 0007                     R3 000B 
N3 0004                     N3 0004 

Explanation of Example: Prior to execution, the 16-bit address register R3 contains the value $0007 
and the 16-bit address offset register N3 contains the value $0004. Execution of the parallel 
move portion of the instruction, (R3)+N3, updates the R3 address register according to the 
specified effective addressing mode by adding the value in the R3 register to the value in 
the N3 register and storing the 16-bit result back in the R3 address register. 


Condition Codes Affected: 
The condition codes are not affected by this type of parallel operation. 
Instruction Format: 
(...) ea 
Opcode: 
15       12 11        8 7                     0
 0  0  1  1| 0  z  R  R|   Data ALU Opcode


Instruction Fields: 

ea        z        RR  Rn
(Rn)-     0        00  R0
(Rn)+Nn   1        01  R1
                   10  R2
                   11  R3
Timing: mv oscillator clock cycles 
Memory: mv program words 
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Move Move 
Operation:                              Assembler Syntax: 
(...); X:<ea> → D                       (...) X:<ea>,D 
(...); S → X:<ea>                       (...) S,X:<ea> 


where (...) refers to any arithmetic or logical instruction which allows parallel moves. 


Description: Move the specified word operand from/to X memory. Two indirect addressing modes may 
be used (postincrement by one and postincrement by the offset register) as well as a 
special addressing mode using the upper word of the accumulator which is not used by the 
Data ALU operation. 


If the arithmetic or logical opcode-operand portion of the instruction specifies a given destination 
accumulator, that same accumulator or portion of that accumulator may not be specified as a destination D in the 
parallel data bus move operation. Thus, if the opcode-operand portion of the instruction specifies the 40-bit 
A or B accumulator as its destination, the parallel data bus move portion of the instruction may not specify 
A0/B0, A1/B1, A2/B2, or A/B as its destination D. That is, duplicate destinations are not allowed within the 
same instruction. 


Exceptions: - DEC24, INC24, CLR24, OR, AND, NOT, EOR, LSL, LSR, ROL, and ROR allow the 
lower portion of the accumulator (A0 or B0) to be the destination of the parallel move 
even if this accumulator is used by the Data ALU operation because these instructions 
only affect the MS 16 or 24 bits of the accumulator. 

- TST, CMP, CMPM allow both the accumulator and its lower portion (A and A0, B and 
B0) to be the parallel move destination even if this accumulator is used by the Data ALU 
operation. These instructions do not have a true destination. 

If the opcode-operand portion of the instruction specifies a given source or destination register, that same 
register or portion of that register may be used as a source S in the parallel data bus move operation. This 
allows data to be moved in the same instruction in which it is being used as a source operand by a Data 
ALU operation. That is, duplicate sources are allowed within the same instruction. 


Example: 
MOVE #$100,R2 
MOVE #4,X1 
ASL A X1,X:(R2)+ ; A*2 → A; save X1 in X:(R2); increment R2 

Before Execution            After Execution 
R2     0100                 R2     0101 
X:$100 0000                 X:$100 0004 

Explanation of Example: Prior to execution, the 16-bit R2 address register contains the value $100 
and the 16-bit X memory location X:$0100 contains the value $0000. Execution of the parallel move portion 
of the instruction, X1,X:(R2)+, uses the R2 address register to move the contents of the X1 register into the 
16-bit X memory location X:$0100. R2 is then incremented by one. 


Condition Codes Affected: 
|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Set according to the standard definition of the S bit. 
L  — Set if data limiting has occurred during parallel move 
Note: The MOVE A,X:<ea> or MOVE B,X:<ea> operation will result in a 16-bit positive or negative satu- 
ration constant being stored in the specified 16-bit X memory location if the signed integer portion 
of the A accumulator or B accumulator, respectively, is in use. 
Instruction Format: 


(...) X:<ea>,D 
(...) S,X:<ea> 
Opcode and Instruction Fields:

15       12 | 11        8 | 7                       0
 1  m  R  R |  H  H  H  W |     Data ALU Opcode

where “RR” refers to an Address Register R0-R3

HHH   S,D     HHH   S,D       Reg.      W       ea        m
000   X0      100   A         read S    0       (Rn)+     0
001   Y0      101   B         write D   1       (Rn)+Nn   1
010   X1      110   A0
011   Y1      111   B0

Timing: mv oscillator clock cycles
Memory: 1 program word
Instruction Format: 
(...) X:(F1),D 
(...) S,X:(F1) 
Opcode and Instruction Fields:

15       12 | 11        8 | 7                       0
 0  1  0  1 |  H  H  H  W |     Data ALU Opcode

HHH   S,D     HHH   S,D       Reg.      W
000   X0      100   A         read S    0
001   Y0      101   B         write D   1
010   X1      110   A0
011   Y1      111   B0

F1 is the upper word of the accumulator which
is not used by the parallel Data ALU operation
(in case of no Data ALU operation, A1 is chosen as F1)

Timing: mv oscillator clock cycles
Memory: mv program words
Parallel Move    X Memory Data Move with Short Displacement    Parallel Move

Operation:                      Assembler Syntax:

(...) X:(R2+xx) -> D            (...) X:(R2+xx),D
(...) S -> X:(R2+xx)            (...) S,X:(R2+xx)


where (...) refers to any arithmetic or logical instruction which allows parallel moves. 


Description: Move the specified word operand from/to X memory. The indirect addressing mode on R2 
indexed by a short (8 bits) signed displacement value is used. The 8-bit signed value is sign 
extended to 16 bits before being added to R2. For example, X:(R2+$F0) and X:(R2-$10)
will access the same memory location. 
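
A minimal sketch of the address calculation described above, in Python. The helper names are hypothetical, and the 16-bit wraparound of the sum is an assumption of this sketch (the text only states that the displacement is sign extended to 16 bits before being added to R2).

```python
def sign_extend8(value):
    """Interpret the low 8 bits of value as a signed two's-complement byte."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def short_displacement_address(r2, disp8):
    """Effective address for X:(R2+xx); the sum is assumed to wrap at 16 bits."""
    return (r2 + sign_extend8(disp8)) & 0xFFFF


# $F0 sign extends to -$10, so both forms address the same location.
print(hex(short_displacement_address(0x0100, 0xF0)))
print(hex(short_displacement_address(0x0100, 0x64)))
```

With R2 = $0100 the displacement $64 selects X:$0164, which matches the example in this section.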


If the arithmetic or logical opcode-operand portion of the instruction specifies a given destination accumu- 
lator, that same accumulator or portion of that accumulator may not be specified as a destination D in the 
parallel data bus move operation. Thus, if the opcode-operand portion of the instruction specifies the 40-bit 
A accumulator as its destination, the parallel data bus move portion of the instruction may not specify A0,
A1, A2, or A as its destination D. Similarly, if the opcode-operand portion of the instruction specifies the
40-bit B accumulator as its destination, the parallel data bus move portion of the instruction may not specify
B0, B1, B2, or B as its destination D. That is, duplicate destinations are not allowed within the same instruction.


If the opcode-operand portion of the instruction specifies a given source or destination register, that same 
register or portion of that register may be used as a source S in the parallel data bus move operation. This 
allows data to be moved in the same instruction in which it is being used as a source operand by a Data 
ALU operation. That is, duplicate sources are allowed within the same instruction. 


Example: 


MOVE #4,X1 
ASL A X1,X:(R2+$64) ; A*2 -> A; save X1 in X:(R2+$64)


Before Execution After Execution 


R2 0100 R2 0100 


X:$164 0000 X:$164 0004 


Explanation of Example: Prior to execution, the 16-bit R2 address register contains the value $100 and
the 16-bit X memory location X:$0164 contains the value $0000. Execution of the parallel
move portion of the instruction, X1,X:(R2+$64), moves the contents of the X1 register into
the 16-bit X memory location X:$0164. R2 is not affected by the instruction.


Condition Codes Affected: 
|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Set according to the standard definition of the S bit. 
L  — Set if data limiting has occurred during parallel move 
Note: The MOVE A,X:(R2+xx) or MOVE B,X:(R2+xx) operation will result in a 16-bit positive or negative 
saturation constant being stored in the specified 16-bit X memory location if the signed integer por- 
tion of the A or B accumulator, respectively, is in use. 


Instruction Format: 
(...) X:(R2+xx),D 
(...) S,X:(R2+xx)
Opcode and Instruction Fields:

15       12 | 11        8 | 7         4 | 3         0
 0  0  0  0 |  0  1  0  1 |  B  B  B  B |  B  B  B  B
               H  H  H  W |     Data ALU Opcode

HHH   S,D     HHH   S,D       Reg.      W
000   X0      100   A         read S    0
001   Y0      101   B         write D   1
010   X1      110   A0
011   Y1      111   B0

“-” = don’t care

BB...B = the 8-bit signed displacement

Timing: mv oscillator clock cycles
Memory: mv program words
Parallel Move    X Memory Data Write and Register Data Move    Parallel Move

Operation:                                     Assembler Syntax:
(MPY or MAC) D -> X:(Rn)+Nn  S -> D            (MPY or MAC) D,X:(Rn)+Nn S,D


Description: In parallel with a MPY or a MAC, move the accumulator which is not used as a destination
by the MPY or MAC into the X memory location specified by the indirect postincrement by 
offset addressing mode, and update this accumulator with the value contained in one of the 
four Data ALU registers. This parallel memory move with register data move is optimized 
for adaptive digital transversal filtering. 


Note: The X memory write operation will result in a 16-bit positive or negative saturation constant being 
stored in the specified 16-bit X memory location if the signed integer portion of the A or B accumu- 
lator is in use. 


Example: 
MAC Y0,X1,B  A,X:(R1)+N1  X1,A

Before Execution              After Execution
A        01:0008:789A         A        00:0003:0000
X1       0003                 X1       0003
X:(R1)   1234                 X:(R1)   7FFF


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $01:0008:789A,
the 16-bit X memory location X:(R1) contains the value $1234 and the 16-bit X1 register
contains the value $0003. Execution of the parallel move portion of the instruction,
A,X:(R1)+N1 X1,A, moves the 16-bit limited positive saturation constant $7FFF into the
X:(R1) memory location and then moves the contents of X1 into A. N1 is also added to R1.
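
The limiting seen in this example can be sketched in Python as follows. This is a simplified model, not the device's implementation: it assumes no scaling (S1 = S0 = 0) and treats the extension as "in use" when bits 39 through 31 of the accumulator are not all equal. The function name is hypothetical.

```python
def limit_accumulator(acc40):
    """Return (word16, limited) for a full accumulator used as a move source.

    Assumes no scaling mode; limited is True when a saturation constant
    is stored instead of the A1/B1 word.
    """
    acc40 &= (1 << 40) - 1
    ext = acc40 >> 32                  # extension byte (A2 or B2)
    msw = (acc40 >> 16) & 0xFFFF       # most significant word (A1 or B1)
    sign = (msw >> 15) & 1             # bit 31 of the accumulator
    if ext == (0xFF if sign else 0x00):
        return msw, False              # extension not in use
    negative = bool(ext & 0x80)        # bit 39 gives the true sign
    return (0x8000 if negative else 0x7FFF), True


word, limited = limit_accumulator(0x010008789A)
print(f"${word:04X}", limited)
```

For the accumulator value $01:0008:789A used above, the sketch yields the positive saturation constant $7FFF with the limiting flag set, which corresponds to the L bit being set.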


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Computed according to the standard definition 
L  — Set if data limiting has occurred during parallel move 


Instruction Format: 


(MPY or MAC) D,X:(Rn)+Nn S,D


Opcode and instruction Fields: 


15       12 | 11        8 | 7         4 | 3         0
 0  0  0  1 |  0  1  1  k |  R  R  D  D |  Data ALU

where “RR” refers to an Address Register R0-R3

D    k
B    0
A    1

Timing: mv oscillator clock cycles
Memory: mv program words
Parallel Move    Dual X Memory Data Read    Parallel Move

Operation:                                 Assembler Syntax:

(...) X:<ea> -> D1  X:<ea> -> D2           (...) X:<ea>,D1 X:<ea>,D2


where (...) refers to a limited set of arithmetic instructions which allow double parallel 
reads (MOVE, MAC(R), MPY(R), ADD, SUB, TFR) 


Description: Move two 16-bit word operands from X memory. Note that two independent effective
addresses can be specified where one of the effective addresses uses the Address Registers
(R0-R2) while the other effective address must use address register R3. Two parallel
addressing modes may be used for each effective address. In that case, address update on
R3 is only performed using linear arithmetic (the value of M3 is ignored). D1 and D2 may
not specify the same register since duplicate destinations are not allowed within the same
instruction.


Note: The second X data memory parallel read never accesses on-chip peripherals. If the value 
addressed by R3 reaches the last 64 locations of the X data memory, external memory will 
be accessed. 


Example: 
MPYR X1,Y0,A  X:(R0)+,Y0  X:(R3)+N3,X1

Before Execution          After Execution
X:(R0)   FFF4             X:(R0)   FFF4
X:(R3)   4321             X:(R3)   4321
X1       0003             X1       4321
Y0       1234             Y0       FFF4


Explanation of Example: Prior to execution, the 16-bit X1 register contains the value $0003 and the 16-bit
Y0 register contains the value $1234. Execution of the parallel move portion of the instruction,
X:(R0)+,Y0 X:(R3)+N3,X1, moves the 16-bit value in the X memory location X:(R0)
into the register Y0, moves the 16-bit X memory location X:(R3) into the register X1,
postincrements by one the 16-bit value in the R0 address register and linearly updates R3 using
the address offset register N3. The contents of the N3 address offset register are not affected.
Condition Codes: 

The condition codes are not affected by this instruction. 


Instruction Format: 
(...) X:<ea>,D1 X:<ea>,D2 


Opcode and Instruction Fields:

15       12 | 11        8 | 7         4 | 3         0
 0  1  1  m |  m  K  K  K |  X  r  r  u |  OPCODE

where: “rr” refers to Address Register R0, R1, R2 for the first read
(R3 has to be used for the second read).

Bits X and u are part of the opcode.

D1    D2    KKK        1st ea     2nd ea     mm
X1    X0    010        (Rn)+      (R3)+      00
Y1    X0    011        (Rn)+Nn    (R3)+N3    11
X0    X1    100
Y0    X1    101
      Y0    110
Y1    X1    111

Timing: mv oscillator clock cycles
Memory: mv program words
MOVE(C)    Move Control Register    MOVE(C)

Operation:                 Assembler Syntax:
X:<ea> -> D                MOVE(C) X:<ea>,D
S -> X:<ea>                MOVE(C) S,X:<ea>
#xxxx -> D                 MOVE(C) #xxxx,D
S -> D                     MOVE(C) S,D
X:(R2+xx) -> D             MOVE(C) X:(R2+xx),D
S -> X:(R2+xx)             MOVE(C) S,X:(R2+xx)


Description: Move the contents of the specified source (control) register S to the specified destination 
or move the specified source to the specified destination (control) register D. The control 
registers S and D consist of the Address ALU modifier registers and the program controller 
registers in addition to the Data ALU registers. These registers may be moved to or from 
any other register or memory space. 


If the system stack register SSH is specified as a source operand, the system stack pointer (SP) is postdec- 
remented by 1 after SSH has been read. If the system stack register SSH is specified as a destination op- 
erand, the system stack pointer (SP) is preincremented by 1 before SSH is written. This allows the system 
stack to be efficiently extended using software stack pointer operations. 
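
The SSH access rules above can be modeled with a few lines of Python. This is a behavioral sketch only: the class name and the stack depth are hypothetical, and stack overflow and underflow are not modeled.

```python
class SystemStackModel:
    """Toy model of SSH accesses: write preincrements SP, read postdecrements SP."""

    def __init__(self, depth=16):       # depth is illustrative, not from the manual
        self.sp = 0
        self.ssh = [0] * depth

    def write_ssh(self, value):
        self.sp += 1                    # SP is preincremented before SSH is written
        self.ssh[self.sp] = value & 0xFFFF

    def read_ssh(self):
        value = self.ssh[self.sp]
        self.sp -= 1                    # SP is postdecremented after SSH is read
        return value


stack = SystemStackModel()
stack.write_ssh(0x1234)
stack.write_ssh(0x5678)
print(hex(stack.read_ssh()), stack.sp)
```

Because a write pushes and a read pops, software can save stack entries to memory and restore them later, which is how the stack is extended in software.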


When a 40-bit accumulator (A or B) is specified as a source operand, the accumulator value is optionally
shifted according to the scaling mode bits S0 and S1 in the system status register (SR). If the data out of
the shifter indicates that the accumulator extension register is in use and the data is to be moved into a
16-bit destination, the value stored in the destination is limited to a maximum positive or negative saturation
constant to minimize truncation error. Limiting does not occur if an individual 16-bit
accumulator register (A1, A0, B1, or B0) is specified as a source operand instead of the full 40-bit
accumulator (A or B). This limiting feature allows block floating point operations to be performed with error
detection since the L bit in the condition code register is latched.


When a 40-bit accumulator (A or B) is specified as a destination operand D, any 16-bit source data to be
moved into that accumulator is automatically extended to 40 bits by sign-extending the MS bit of the source
operand (bit 15) and appending the source operand with 16 LS zeros. Whenever the OMR or SP registers
are source operands to be moved into a 40-bit accumulator, they are first zero extended to form a 16-bit
operand. Note that for 16-bit source operands both the automatic sign-extension and zeroing features may
be disabled by specifying the destination register to be one of the individual 16-bit accumulator registers (A1
or B1).
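
As an illustration of the extension rule for a full accumulator destination, the following Python sketch builds the 40-bit value from a 16-bit source. The function names and the `$xx:xxxx:xxxx` formatting helper are hypothetical conveniences of this sketch.

```python
def load_accumulator(word16):
    """Place a 16-bit word in A1/B1, sign extend into A2/B2, and zero A0/B0."""
    word16 &= 0xFFFF
    extension = 0xFF if word16 & 0x8000 else 0x00
    return (extension << 32) | (word16 << 16)


def format_accumulator(acc40):
    """Render a 40-bit value in the manual's $A2:A1:A0 notation."""
    return f"${acc40 >> 32:02X}:{(acc40 >> 16) & 0xFFFF:04X}:{acc40 & 0xFFFF:04X}"


print(format_accumulator(load_accumulator(0x0003)))
print(format_accumulator(load_accumulator(0x8000)))
```

Writing to A1 or B1 alone would bypass both steps and leave the extension byte and the lower word unchanged.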


Note: Due to pipelining, if an address register (R, N, or M) is changed using a move type instruction, the 
new contents of the destination address register will not be available for use during the following instruction 
(i.e., there is a single instruction cycle pipeline delay). 


Restrictions:
- A MOVE(C) instruction used within a DO loop which specifies SSH as the source operand
or LA, LC, SR, SP, SSH, or SSL as the destination operand cannot begin at address
LA-2, LA-1, or LA within that DO loop.

- A MOVE(C) instruction which specifies SSH as the source operand or LA, LC, SSH,
SSL, or SP as the destination operand cannot be used immediately before a DO instruction.

- A MOVE(C) instruction which specifies SSH as the source operand or LA, LC, SR,
SSH, SSL, or SP as the destination operand cannot be used immediately before an
ENDDO instruction.

- A MOVE(C) instruction which specifies SSH as the source operand or SR, SSH, SSL,
or SP as the destination operand cannot be used immediately before an RTI instruction.

- A MOVE(C) instruction which specifies SSH as the source operand or SSH, SSL, or
SP as the destination operand cannot be used immediately before an RTS instruction.

- A MOVE(C) instruction which specifies SP as the destination operand cannot be used
immediately before a MOVE(C), MOVE(M), or MOVE(P) instruction which specifies
SSH or SSL as the source operand.

- A MOVE(C) SSH,SSH instruction is illegal and cannot be used.


Example: 
MOVE(C) LC,X0 ; move LC into X0
Before Execution After Execution 
LC 0100 LC 0100 
X0 3210 X0 0100 


Explanation of Example: Prior to execution, the 16-bit loop counter (LC) register contains the value
$0100 and the 16-bit X0 register contains the value $3210. Execution of the MOVE(C)
LC,X0 instruction moves the contents of the 16-bit LC register into the 16-bit X0 register.


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


For D = SR operand:

S - Set according to bit 7 of the source operand
L - Set according to bit 6 of the source operand
E - Set according to bit 5 of the source operand
U - Set according to bit 4 of the source operand
N - Set according to bit 3 of the source operand
Z - Set according to bit 2 of the source operand
V - Set according to bit 1 of the source operand
C - Set according to bit 0 of the source operand

For D1 and D2 ≠ SR operand:

S - Set according to the standard definition of the S bit
L - Set if data limiting has occurred during move


Opcode and Instruction Fields:

Instruction Format: MOVE(C) X:<ea>,D
                    MOVE(C) S,X:<ea>

15       12 | 11        8 | 7         4 | 3         0
 0  0  1  1 |  1  W  D  D |  D  D  D  0 |  M  M  R  R

ea        MM
(Rn)      00
(Rn)+     01
(Rn)-     10
(Rn)+Nn   11

where “RR” refers to an Address Register R0-R3

Instruction Format: MOVE(C) X:<ea>,D
                    MOVE(C) S,X:<ea>

15       12 | 11        8 | 7         4 | 3         0
 0  0  1  1 |  1  W  D  D |  D  D  D  1 |  q  0  R  R

ea        q
(Rn+Nn)   0
-(Rn)     1

Instruction Format: MOVE(C) X:<A1,B1>,D
                    MOVE(C) S,X:<A1,B1>

15       12 | 11        8 | 7         4 | 3         0
 0  0  1  1 |  1  W  D  D |  D  D  D  1 |  Z  1  1  -

ea        Z
(A1)      0
(B1)      1

Instruction Format: MOVE(C) #xxxx,D or MOVE(C) X:xxxx,D or MOVE(C) S,X:xxxx

15       12 | 11        8 | 7         4 | 3         0
 0  0  1  1 |  1  W  D  D |  D  D  D  1 |  t  1  0  -

Extension Word         t
16-bit long address    0
16-bit long data       1

S/D   DDDDD    S/D   DDDDD    S/D   DDDDD    S/D   DDDDD        Reg.      W
X0    00000    SR    01001    R0    10000    SSH   11000        read S    0
Y0    00001    OMR   01010    R1    10001    SSL   11001        write D   1
X1    00010    SP    01011    R2    10010    LA    11010
Y1    00011    A1    01100    R3    10011    LC    01000
A     00100    B1    01101    M0    10100    N0    11100
B     00101    A2    01110    M1    10101    N1    11101
A0    00110    B2    01111    M2    10110    N2    11110
B0    00111                   M3    10111    N3    11111

“-” = don’t care

Timing: 2 + mvc oscillator clock cycles
Memory: 1 + ea program words


Instruction Format: 
MOVE(C) S,D 


Opcode and Instruction Fields: 


B  |00101 |A2 01110 |M1
A0 |00110 |B2 01111 |M2


ddddd=DDDDD 


Timing: 2 oscillator clock cycles 
Memory: 1 program word 


Instruction Format: 


MOVE(C) X:(R2+xx),D 
MOVE(C) S,X:(R2+xx) 
Opcode and Instruction Fields: 


Reg.      W
read S    0
write D   1

“xx” refers to an 8-bit data BBBBBBBB

D     DDDDD    D     DDDDD    D     DDDDD    D     DDDDD
S     ddddd    S     ddddd    S     ddddd    S     ddddd

X0    00000    SR    01001    R0    10000    SSH   11000
Y0    00001    OMR   01010    R1    10001    SSL   11001
X1    00010    SP    01011    R2    10010    LA    11010
Y1    00011    A1    01100    R3    10011    LC    01000
A     00100    B1    01101    M0    10100    N0    11100
B     00101    A2    01110    M1    10101    N1    11101
A0    00110    B2    01111    M2    10110    N2    11110
B0    00111                   M3    10111    N3    11111

“-” = don’t care

Timing: 2 + mvc oscillator clock cycles
Memory: 2 program words
MOVE(I)    Move Immediate Short    MOVE(I)

Operation:                 Assembler Syntax:

#xx -> D                   MOVE(I) #xx,D


Description: The 8-bit signed immediate operand is sign extended to 16 bits and stored in destination
register D, so that the immediate value occupies the low byte of D.


Example: 
MOVE(I) #<$84,X1 ; equivalent to MOVE #<-$7C,X1

Before Execution          After Execution
X1   FFFF                 X1   FF84


Explanation of Example: Prior to execution, X1 contains the value $FFFF. Execution of the instruction
moves the value $FF84 into X1.


Condition Codes: 
The condition codes are not affected by this instruction. 


Instruction Format: 
MOVE(I) #xx,D 


Opcode and Instruction Fields:

“xx” refers to an 8-bit data BBBBBBBB

15       12 | 11        8 | 7         4 | 3         0
 0  0  1  0 |  0  0  D  D |  B  B  B  B |  B  B  B  B

Destination   DD
X0            00
Y0            01
X1            10
Y1            11
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
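
The MOVE(I) field layout can be exercised with a short Python sketch. It assumes the opcode word reads 0010 00DD BBBBBBBB, which is this editor's reading of the field diagram above and should be checked against the original manual. The function names are hypothetical.

```python
# Destination codes from the DD table; the bit layout is an assumption of this sketch.
DD_CODES = {"X0": 0b00, "Y0": 0b01, "X1": 0b10, "Y1": 0b11}


def encode_move_i(immediate, destination):
    """Build the 16-bit MOVE(I) word, assuming the layout 0010 00DD BBBBBBBB."""
    if not -128 <= immediate <= 255:
        raise ValueError("immediate does not fit in 8 bits")
    return (0b0010 << 12) | (DD_CODES[destination] << 8) | (immediate & 0xFF)


def execute_move_i(opcode):
    """Return (register, value) written by the instruction word."""
    code = (opcode >> 8) & 0b11
    register = next(name for name, dd in DD_CODES.items() if dd == code)
    byte = opcode & 0xFF
    value = byte | 0xFF00 if byte & 0x80 else byte   # sign extend to 16 bits
    return register, value


word = encode_move_i(0x84, "X1")
print(f"{word:04X}", execute_move_i(word))
```

Under that assumed layout, #$84 and #-$7C produce the same instruction word, and executing it leaves $FF84 in X1 as in the example.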
MOVE(M)    Move Program Memory    MOVE(M)

Operation:                 Assembler Syntax:

P:<ea> -> D                MOVE(M) P:<ea>,D
S -> P:<ea>                MOVE(M) S,P:<ea>
P:(R2+xx) -> D             MOVE(M) P:(R2+xx),D
S -> P:(R2+xx)             MOVE(M) S,P:(R2+xx)
P:<ea> -> X:<ea>           MOVE(M) P:<ea>,X:<ea>
X:<ea> -> P:<ea>           MOVE(M) X:<ea>,P:<ea>
Description: 


Move the specified operand from/to the specified program memory location. The source and destination 
registers S and D may be selected Data ALU registers. 


When a 40-bit accumulator (A or B) is specified as a source operand S, the accumulator value is optionally
shifted according to the scaling mode bits S0 and S1 in the system status register (SR). If the data out of
the shifter indicates that the accumulator extension register is in use and the data is to be moved into a
16-bit destination, the value stored in the destination is limited to a maximum positive or negative saturation
constant to minimize truncation error. Limiting does not occur if an individual 16-bit accumulator register (A1,
A0, B1, or B0) is specified as a source operand instead of the full 40-bit accumulator (A or B). This limiting
feature allows block floating point operations to be performed with error detection since the L bit in the
condition code register is latched.


When a 40-bit accumulator (A or B) is specified as a destination operand D, any 16-bit source data to be
moved into that accumulator is automatically extended to 40 bits by sign-extending the MS bit of the
source operand (bit 15) and appending the source operand with 16 LS zeros. Note that for 16-bit source
operands both the automatic sign-extension and zeroing features may be disabled by specifying the
destination register to be one of the individual 16-bit accumulator registers (A1 or B1).


Example: 
MOVE(M) P:(R2)+N2,A0 ; move P:(R2) into the LS word of A (A0), update R2 with N2

Before Execution              After Execution
A        A5:8123:0123         A        A5:0246:0116
P:(R2)   0116                 P:(R2)   0116


Explanation of Example: Prior to execution, the 16-bit A0 register contains the value $0123 and the
16-bit program memory location P:(R2) contains the value $0116. Execution of the MOVE(M)
P:(R2)+N2,A0 instruction moves the 16-bit program memory location P:(R2) into the 16-bit A0
register. R2 is then postincremented by N2.


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


L  — Set if data limiting has occurred during the move 
Instruction Format: 


MOVE(M) S,P:<ea> 
MOVE(M) P:<ea>,D 


Code and Instruction Fields:

15       12 | 11        8 | 7         4 | 3         0
 0  0  0  0 |  0  0  1  W |  R  R  0  M |  M  H  H  H

ea        MM
(Rn)      00
(Rn)+     01
(Rn)-     10
(Rn)+Nn   11

where “RR” refers to an Address Register R0-R3

HHH   S,D     HHH   S,D       Reg.      W
000   X0      100   A         read S    0
001   Y0      101   B         write D   1
010   X1      110   A0
011   Y1      111   B0

Timing: 2 + mvm oscillator clock cycles
Memory: 1 program word


Instruction Format: 


MOVE(M) P:<ea>,X:<ea> 
MOVE(M) X:<ea>,P:<ea> 


Code and Instruction Fields:

15       12 | 11        8 | 7         4 | 3         0
 0  0  0  0 |  0  0  1  W |  R  R  1  1 |  m  m  R  R

where “RR” refers to Address Register R0-R3

Note: Bits 0, 1, and 2 refer to the destination effective address
while bits 3, 6, and 7 refer to the source effective address.

S ea       D ea       mm
(Rn)+      (Rn)+      00
(Rn)+      (Rn)+Nn    01
(Rn)+Nn    (Rn)+      10
(Rn)+Nn    (Rn)+Nn    11

Where S and D must use
different registers.

Reg.      W
read S    0
write D   1

Timing: 2 + mvm oscillator clock cycles
Memory: 1 program word


Instruction Format: 
MOVE(M) S,P:(R2+xx) 
MOVE(M) P:(R2+xx),D 


Code and Instruction Fields:

15       12 | 11        8 | 7         4 | 3         0
 0  0  0  0 |  0  1  0  1 |  B  B  B  B |  B  B  B  B
 0  0  0  0 |  0  0  1  W |  -  -  0  - |  -  H  H  H

HHH   S/D     HHH   S/D       Reg.      W
000   X0      100   A         read S    0
001   Y0      101   B         write D   1
010   X1      110   A0
011   Y1      111   B0

“xx” refers to an 8-bit data BBBBBBBB

“-” = don’t care

Timing: 4 + mvm oscillator clock cycles
Memory: 2 program words
MOVE(P)    Move Peripheral Data    MOVE(P)

Operation:                 Assembler Syntax:

X:<pp> -> D                MOVE(P) X:<pp>,D
X:<ea> -> X:<pp>           MOVE(P) X:<ea>,X:<pp>
S -> X:<pp>                MOVE(P) S,X:<pp>
X:<pp> -> X:<ea>           MOVE(P) X:<pp>,X:<ea>


Description: Move the specified operand from/to the specified X I/O peripheral. The I/O Short Absolute 
Addressing mode is used for the I/O peripheral address. Only the (Rn)+ and (Rn)+Nn ad- 
dress register indirect addressing modes are allowed. 


When a 40-bit accumulator (A or B) is specified as a source operand S, the accumulator
value is optionally shifted according to the scaling mode bits S0 and S1 in the system status
register (SR). If the data out of the shifter indicates that the accumulator extension register
is in use and the data is to be moved into a 16-bit destination, the value stored in the
destination is limited to a maximum positive or negative saturation constant to minimize
truncation error. Limiting does not occur if an individual 16-bit accumulator register (A1, A0, B1,
or B0) is specified as a source operand instead of the full 40-bit accumulator (A or B). This
limiting feature allows block floating point operations to be performed with error detection
since the L bit in the condition code register is latched.


When a 40-bit accumulator (A or B) is specified as a destination operand D, any 16-bit
source data to be moved into that accumulator is automatically extended to 40 bits by
sign-extending the MS bit of the source operand (bit 15) and appending the source operand with
16 LS zeros. Note that for 16-bit source operands both the automatic sign-extension and
zeroing features may be disabled by specifying the destination register to be one of the
individual 16-bit accumulator registers (A1 or B1).


Example: 
MOVE(P) A,X:<$FFE2 ; initialize Port B Data Register

Before Execution          After Execution
X:$FFE2   FFFF            X:$FFE2   0024
A         0024            A         0024
Explanation of Example: Prior to execution, the 16-bit, X Memory-mapped Port B Data Register (PBD)
contains the value $FFFF. Execution of the MOVE(P) A,X:<$FFE2 instruction moves the
value $0024 contained in A into the 16-bit Port B Data Register (PBD), resulting in pins PB2
and PB5 remaining set while all other pins of port B are cleared (the example assumes that
all port B pins are programmed as output).
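
To see why $0024 leaves exactly PB2 and PB5 set, the written value can be decoded bit by bit. The Python helper below is hypothetical and assumes that bit n of the data register drives pin PBn, as the example implies.

```python
def pins_set(value, prefix="PB", width=16):
    """List the pin names whose bits are 1 in the written value."""
    return [f"{prefix}{bit}" for bit in range(width) if (value >> bit) & 1]


# $0024 = 0000 0000 0010 0100 in binary
print(pins_set(0x0024))
```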


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S — Set according to the standard definition of the S bit 
L — Set if data limiting has occurred during move 


Opcode and Instruction Fields:

Instruction Format:
MOVE(P) X:<pp>,D
MOVE(P) S,X:<pp>

15       12 | 11        8 | 7         4 | 3         0
 0  0  0  1 |  1  0  0  W |  H  H  1  p |  p  p  p  p

S,D   HH
X0    00
Y0    01
A     10
B     11

Instruction Format:
MOVE(P) X:<ea>,X:<pp>
MOVE(P) X:<pp>,X:<ea>

15       12 | 11        8 | 7         4 | 3         0
 0  0  0  0 |  1  1  0  W |  R  R  m  p |  p  p  p  p

ea        m         Reg.      W
(Rn)+     0         read S    0
(Rn)+Nn   1         write D   1

where “RR” refers to an Address Register R0-R3
pp = 5-bit absolute address = ppppp

Timing: 4 + mvp oscillator clock cycles
Memory: 1 program word
MOVE(S)    Move Absolute Short    MOVE(S)

Operation:                 Assembler Syntax:
X:<aa> -> D                MOVE(S) X:<aa>,D
S -> X:<aa>                MOVE(S) S,X:<aa>


Description: Move the specified operand from/to the lower 32 memory locations in X Data memory. The
5-bit absolute short address is zero extended to form the 16-bit address.


When a 40-bit accumulator (A or B) is specified as a source operand S, the accumulator
value is optionally shifted according to the scaling mode bits S0 and S1 in the system status
register (SR). If the data out of the shifter indicates that the accumulator extension register
is in use and the data is to be moved into a 16-bit destination, the value stored in the
destination is limited to a maximum positive or negative saturation constant to minimize
truncation error. Limiting does not occur if an individual 16-bit accumulator register (A1, A0, B1,
or B0) is specified as a source operand instead of the full 40-bit accumulator (A or B). This
limiting feature allows block floating point operations to be performed with error detection
since the L bit in the condition code register is latched.


When a 40-bit accumulator (A or B) is specified as a destination operand D, any 16-bit
source data to be moved into that accumulator is automatically extended to 40 bits by
sign-extending the MS bit of the source operand (bit 15) and appending the source operand with
16 LS zeros. Note that for 16-bit source operands both the automatic sign-extension and
zeroing features may be disabled by specifying the destination register to be one of the
individual 16-bit accumulator registers (A1 or B1).

Example: 


MOVE(S) A,X:<$10 ; initialize X:$10

Before Execution          After Execution
A         0024            A         0024
X:$0010   FFFF            X:$0010   0024

Explanation of Example: Prior to execution, X:$10 contains the value $FFFF. Execution of the
instruction moves the value $0024 into the memory location X:$10.




Condition Codes Affected:

|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C

S — Set according to the standard definition of the S bit 
L — Set if data limiting has occurred during move 


Instruction Format: 


MOVE(S) X:<aa>,D 
MOVE(S) S,X:<aa> 


Opcode and Instruction Fields:

15       12 | 11        8 | 7         4 | 3         0
 0  0  0  1 |  1  0  0  W |  H  H  0  a |  a  a  a  a

where “aa” refers to a 5-bit absolute address

Reg.      W
read S    0
write D   1

Timing: 2 + mvs oscillator clock cycles
Memory: 1 program word
MPY    Signed Multiply    MPY

Operation:                                      Assembler Syntax:

±S1 * S2 -> D (one parallel move)               MPY (±)S1,S2,D (one parallel move)
S1 * S2 -> D (two parallel reads)               MPY S1,S2,D (two parallel reads)
S1 * S2 -> D  D -> X:(Rn)+Nn  S -> D            MPY S1,S2,D  D,X:(Rn)+Nn S,D


Description: Multiply the two signed 16-bit source operands S1 and S2 and store the product in the 
specified 40-bit destination accumulator D. The “-” sign option is used to negate the spec- 
ified product. This option is not available when two parallel reads are performed. The de- 
fault sign option is “+”. The instruction which accesses D is particularly useful for imple- 
menting the Least Mean Square (LMS) adaptive filter algorithm (see Appendix B). 


Example: 
MPY X1,Y1,A  A,X1 ; multiply X1 by Y1, save A in X1 first

Before Execution              After Execution
A    00:1000:0000             A    FF:FA2B:0000
X1   4000                     X1   1000
Y1   F456                     Y1   F456


Explanation of Example: Prior to execution, the 16-bit X1 register contains the value $4000 (0.5), the
16-bit Y1 register contains the value $F456 (-0.0911255) and the 40-bit A accumulator
contains the value $00:1000:0000 (0.125). Execution of the MPY X1,Y1,A instruction multiplies
the 16-bit signed value in the X1 register by the 16-bit signed value in Y1 and stores the
result ($FF:FA2B:0000) into the accumulator A (X1 * Y1 = -0.045562744140625). In parallel,
A has been saved into X1.
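
The arithmetic of this example can be reproduced with a short Python sketch of a signed fractional multiply. It assumes the usual 1.15 fractional convention, in which the integer product is shifted left by one bit to align with the 40-bit accumulator; the function names are hypothetical and the parallel move is not modeled.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's-complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def mpy(s1, s2, negate=False):
    """Signed fractional multiply of two 16-bit words into a 40-bit result."""
    product = to_signed16(s1) * to_signed16(s2) * 2   # fractional alignment
    if negate:
        product = -product
    return product & ((1 << 40) - 1)


def as_fraction(acc40):
    """Convert a 40-bit accumulator value to its fractional value."""
    if acc40 & (1 << 39):
        acc40 -= 1 << 40
    return acc40 / (1 << 31)


result = mpy(0x4000, 0xF456)
print(f"${result >> 32:02X}:{(result >> 16) & 0xFFFF:04X}:{result & 0xFFFF:04X}")
print(as_fraction(result))
```

With X1 = $4000 and Y1 = $F456 the sketch gives $FF:FA2B:0000, that is -0.045562744140625, in agreement with the values quoted in the explanation.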


Condition Codes Affected: 


|<------------ MR ------------>|<------------ CCR ----------->|
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
 LF   *   *   *  S1  S0  I1  I0   S   L   E   U   N   Z   V   C


S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow (result) has occurred
E - Set if the signed integer portion of A or B result is in use
U - Set according to the standard definition of the U bit
N - Set if bit 39 of A or B result is set
Z - Set if A or B result equals zero
V - Set if overflow has occurred in A or B result


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format: MPY (±)S1,S2,D (one parallel move)
Opcode:
15       12 | 11        8 | 7         4 | 3         0
 1  m  R  R |  H  H  H  W |  1  k  0  0 |  F  Q  Q  Q

Sign   k
 +     0
 -     1


Instruction Fields: Please see the “X Memory Data Move’ description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


Instruction Format: MPY S1,S2,D (two parallel reads)
Opcode: 
15 12 11 8 7 4 3 0 


D1 imme K KT x x off 0 ac 


Instruction Fields: Please see the “Dual X Memory Data Read” description in the parallel move sec- 
tion for details on the mm and KKK data fields. 


Instruction Format: MPY S1,S2,D  D,X:(Rn)+Nn S,D (one memory write, one data register move)
Opcode:
15       12 | 11        8 | 7         4 | 3         0
 0  0  0  1 |  0  1  1  0 |  R  R  D  D |  F  Q  Q  Q


Instruction Fields: Please see the “X Memory Data Write and Register Data Move’ description in the 
parallel move section for details on the RR and DD data fields. 


Instruction Fields:

One Or Two Parallel Operation

S1,S2,D    QQQ  F        S1,S2,D    QQQ  F
X0,X0,A    000  0        Y0,X0,A    100  0
X0,X0,B    000  1        Y0,X0,B    100  1
X1,X0,A    001  0        Y1,X0,A    101  0
X1,X0,B    001  1        Y1,X0,B    101  1
A1,Y0,A    010  0        Y0,X1,A    110  0
A1,Y0,B    010  1        Y0,X1,B    110  1
B1,X0,A    011  0        Y1,X1,A    111  0
B1,X0,B    011  1        Y1,X1,B    111  1

Two Parallel Reads

S1,S2,D    QQ  F         S1,S2,D    QQ  F
X0,Y0,A    00  0         X1,Y0,A    10  0
X0,Y0,B    00  1         X1,Y0,B    10  1
X0,Y1,A    01  0         X1,Y1,A    11  0
X0,Y1,B    01  1         X1,Y1,B    11  1

Timing: 2 + mv oscillator clock cycles 

Memory: 1 program word 


MPYR Signed Multiply and Round MPYR 


Operation:                                  Assembler Syntax:
±S1 * S2 + r → D (one parallel move)        MPYR (±)S1,S2,D (one parallel move)
S1 * S2 + r → D (two parallel reads)        MPYR S1,S2,D (two parallel reads)


Description: Multiply the two signed 16-bit source operands S1 and S2, round the result using the
specified rounding mode, and store it in the specified 40-bit destination accumulator D.
Refer to the RND instruction for more complete information on the convergent rounding
process. The “-” sign option is used to negate the specified product. This option is not
available when two parallel reads are performed. The default sign option is “+”.


Example:
MPYR -X0,Y1,A A0,X0 ;multiply X0 by Y1 and negate the product, first save A0 in X0

               Before Execution     After Execution
A (A2 A1 A0)   00 1000 1234         FF FE8B 0000
X0             4000                 1234
Y1             F456                 F456


Explanation of Example: Prior to execution, the 16-bit X0 register contains the value $4000 (0.5), the
16-bit Y1 register contains the value $F456 (-0.0911255) and the 40-bit A accumulator
contains the value $00:1000:1234 (0.125002169981599). Execution of the MPYR -X0,Y1,A
instruction multiplies the 16-bit signed value in the X0 register by the 16-bit signed value in
Y1, rounds the result and stores the result ($FF:FE8B:0000) into the accumulator A (-X0 *
Y1 = -0.011383056640625). In parallel, A0 is saved into X0 before the result is stored in A.
In this example, the default rounding (convergent rounding) is performed.


Condition Codes Affected: 


|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C

S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if overflow has occurred in A or B result


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 
Instruction Format: MPYR (±)S2,S1,D (one parallel move)
Opcode:
 15       12 11        8 7         4 3         0
  1  m  R  R | H  H  H  W | 1  k  0  1 | F  Q  Q  Q

Sign   k
 +     0
 -     1


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


One Parallel Operation

S1,S2,D   QQQ  F      S1,S2,D   QQQ  F
X0,X0,A   000  0      Y0,X0,A   100  0
X0,X0,B   000  1      Y0,X0,B   100  1
X1,X0,A   001  0      Y1,X0,A   101  0
X1,X0,B   001  1      Y1,X0,B   101  1
A1,Y0,A   010  0      Y0,X1,A   110  0
A1,Y0,B   010  1      Y0,X1,B   110  1
B1,X0,A   011  0      Y1,X1,A   111  0
B1,X0,B   011  1      Y1,X1,B   111  1

Instruction Format: MPYR S1,S2,D (two parallel reads)

Opcode: 

15 12 11 8 7 4 3 0 

two parallel reads 1—— 1/F 0QQ 


Instruction Fields: Please see the “Dual X Memory Data Read” description in the parallel move sec- 
tion for details on the mm and KKK data fields. 


Two Parallel Reads

S1,S2,D   QQ  F       S1,S2,D   QQ  F
X0,Y0,A   00  0       X1,Y0,A   10  0
X0,Y0,B   00  1       X1,Y0,B   10  1
X0,Y1,A   01  0       X1,Y1,A   11  0
X0,Y1,B   01  1       X1,Y1,B   11  1

“—” = don’t care
Timing: 2 + mv oscillator clock cycles
Memory: 1 program word 



MPY(su,uu)                       Mixed Multiply                       MPY(su,uu)


Operation:                                      Assembler Syntax:

S1 * S2 → D (S1 unsigned, S2 unsigned)          MPYuu S1,S2,D (no parallel move)
S1 * S2 → D (S1 signed, S2 unsigned)            MPYsu S1,S2,D (no parallel move)


Description: Multiply the two 16-bit source operands S1 and S2 and store the product to the specified 
40-bit destination accumulator D. One or two of the source operands can be unsigned. This 
mixed arithmetic multiply does not allow a parallel move and can be used for multiple pre- 
cision multiplications. 


Example:
MPYuu X1,Y1,A
MPYsu X1,Y1,A

      Before Execution           After Execution
X1    FFFF                       FFFF
Y1    0062                       0062

      Before MPYuu Execution     After MPYuu Execution
A     00 1000 0000               00 00C3 FFC3
      A2 A1   A0                 A2 A1   A0

      Before MPYsu Execution     After MPYsu Execution
A     00 00C3 FFC3               FFFF FFC3
      A2 A1   A0                 A2 A1 A0


Explanation of Example: The 16-bit X1 register contains the value $FFFF and the 16-bit Y1 register 
contains the value $0062. 
Execution of the MPYuu X1,Y1,A instruction multiplies the 16-bit unsigned value in the X1 
register by the 16-bit unsigned value in Y1 and stores the unsigned result into the accumu- 
lator A. 
Execution of the MPYsu X1,Y1,A instruction multiplies the 16-bit signed value in the X1
register by the 16-bit unsigned value in Y1 and stores the signed result into the accumulator A.


Warning: The saturation mode is always disabled during execution of MPY(su,uu), even when the 
saturation bit (SA) of the OMR is set. Refer to Section 5.8.3 for more details. 
Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C

E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if overflow has occurred in A or B result


Note: The definition of the E and U bits varies according to the scaling mode being used. 
Please refer to Section A.4 entitled “Condition Code Computation” for complete details. 
Instruction Format: 


MPY(uu) S1,S2,D
MPY(su) S1,S2,D


Opcode: 
15 12 11 8 7 4 3 0 
ooo 1b 101/11 00/Fsa0 
Instruction Fields:

Arithmetic  s       S1,S2,D   QQ  F      S1,S2,D   QQ  F
su          0       Y0,X0,A   00  0      X1,Y0,A   10  0
uu          1       Y0,X0,B   00  1      X1,Y0,B   10  1
                    Y1,X0,A   01  0      X1,Y1,A   11  0
                    Y1,X0,B   01  1      X1,Y1,B   11  1

Note: For MPYsu, the order of S1, S2 is significant; the signed value will be taken from
S1 and the unsigned value will be taken from S2 (i.e., MPYSU Y1,X0,A is legal whereas
MPYSU X0,Y1,A is illegal).
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
NEG                            Negate Accumulator                            NEG


Operation:                          Assembler Syntax:


0 - D → D (parallel move)           NEG D (parallel move)


Description: The destination operand D is subtracted from zero and the result is stored in the
destination accumulator.


Example:
NEG B X1,X:(R3)+ ;0-B → B, save X1, update R3

               Before Execution     After Execution
B (B2 B1 B0)   00 1234 5678         FF EDCB A988
SR (MR:CCR)    0300                 0309


Explanation of Example: Prior to execution, the 40-bit B accumulator contains the value $00:1234:5678. 
The NEG B instruction takes the two’s complement of the value in the B accumulator and 
stores the 40-bit result back in the B accumulator. 
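
The arithmetic in this example can be checked with a short behavioral model. The Python sketch below is illustrative only and is not part of the DSP56100 tool set; the helper name `neg40` and its tuple return value are assumptions made for the illustration. It treats the accumulator as a 40-bit two's-complement value and derives the N, Z and C bits as they are described under Condition Codes Affected.

```python
MASK40 = (1 << 40) - 1  # A and B are 40 bits wide (extension:MSP:LSP)

def neg40(acc):
    """Return (result, n, z, c) for 0 - acc on a 40-bit accumulator."""
    result = (0 - acc) & MASK40
    n = (result >> 39) & 1   # N: bit 39 of the result
    z = int(result == 0)     # Z: result equals zero
    c = int(acc != 0)        # C: 0 - acc borrows unless acc is zero
    return result, n, z, c

result, n, z, c = neg40(0x0012345678)
print(f"${result >> 32:02X}:{(result >> 16) & 0xFFFF:04X}:{result & 0xFFFF:04X} N={n} Z={z} C={c}")
```

With N and C set and the other modeled bits clear, the CCR byte is $09, which agrees with the status register changing from $0300 to $0309 in the example.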


Condition Codes Affected: 


|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C

S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of A or B is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if overflow has occurred in A or B result
C — Set if a borrow is generated from the MSB of the result.


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 
Instruction Format:


NEG D (parallel move)
Opcode:

 15       12 11        8 7         4 3         0
  1  m  R  R | H  H  H  W | 0  1  1  0 | F  0  0  0


Instruction Fields: Please see the “X Memory Data Move’ description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
NEGC                     Negate Accumulator with Carry                     NEGC


Operation:                              Assembler Syntax:
0 - D - C → D (no parallel move)        NEGC D (no parallel move)

Description: The destination operand D is subtracted from zero along with the C bit of the condition
code register (CCR) and the result is stored in the destination accumulator.
Example: 
NEGC B 
               Before Execution     After Execution
B (B2 B1 B0)   00 1234 5678         FF EDCB A987
SR (MR:CCR)    0301                 0309


Explanation of Example: Prior to execution, the 40-bit B accumulator contains the value $00:1234:5678. 
The NEGC B instruction subtracts from zero the value in the B accumulator along with the
carry bit C of CCR and stores the 40-bit result back in the B accumulator. 
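
The effect of the carry bit on the result can be seen in a small model. The following Python sketch is an illustration under stated assumptions, not a description of the hardware: the name `negc40` is hypothetical, and the carry out is modeled as the borrow of the unbounded subtraction, matching the wording "borrow is generated from the MSB of the result".

```python
MASK40 = (1 << 40) - 1  # 40-bit accumulator

def negc40(acc, carry_in):
    """Return (result, carry_out) for 0 - acc - C on a 40-bit accumulator."""
    full = 0 - acc - (carry_in & 1)  # exact difference as an unbounded integer
    result = full & MASK40           # wrap to 40 bits
    carry_out = int(full < 0)        # a borrow occurred if the exact difference is negative
    return result, carry_out

result, carry = negc40(0x0012345678, 1)  # SR = $0301 before execution, so C = 1
print(f"${result >> 32:02X}:{(result >> 16) & 0xFFFF:04X}:{result & 0xFFFF:04X} C={carry}")
```

With C = 1 on entry the result is one lower than the plain negation of the same operand, which is the $A987 low word shown in the example.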


Condition Codes Affected: 


|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C

E — Set if the signed integer portion of A or B is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero. Cleared otherwise
V — Set if overflow has occurred in A or B result
C — Set if a borrow is generated from the MSB of the result.


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 
Instruction Format:


NEGC D
Opcode:

  0  0  0  1 | 0  1  0  1 | 0  1  1  0 | F  0  0  0


Instruction Fields: 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
NOP                               No Operation                               NOP


Operation:              Assembler Syntax:


PC + 1 → PC             NOP


Description: Increment the program counter (PC). Pending pipeline actions, if any, are completed. Ex- 
ecution continues with the instruction following the NOP. 


Example: 


NOP ;increment the program counter


Explanation of Example: 
The NOP instruction increments the program counter (PC) and completes any pending 
pipeline actions. 


Condition Codes Affected: 
The condition codes are not affected by this instruction. 
Instruction Format:


NOP
Opcode:

  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 0  0  0  0


Instruction Fields: none 


Timing: 2 oscillator clock cycles 
Memory: 1 program word 
NORM                     Normalize Accumulator Iteration                     NORM


Operation:                                          Assembler Syntax:
If Ē • U • Z̄ = 1, then ASL D and Rn - 1 → Rn        NORM Rn,D
else if E = 1, then ASR D and Rn + 1 → Rn
else NOP


where Ē denotes the logical complement of E, and
where • denotes the logical AND operator


Description: Perform one normalization iteration on the specified destination operand D, update the 
specified address register Rn based upon the results of that iteration, and store the result 
back in the destination accumulator. This is a 40-bit operation. If the accumulator extension 
is not in use and the accumulator is unnormalized and the accumulator is not zero, the des- 
tination operand is arithmetically shifted one bit to the left and the specified address register 
is decremented by one. If the accumulator extension register is in use, the destination op- 
erand is arithmetically shifted one bit to the right and the specified address register is incre- 
mented by one. If the accumulator is normalized or zero, a NOP is executed and the spec- 
ified address register is not affected. Since the operation of the NORM instruction depends 
on the E, U, and Z condition code register bits, these bits must correctly reflect the current 
state of the destination accumulator prior to executing the NORM instruction. Note that the 
L and V bits in the condition code register will be cleared unless they have been improperly 
set up prior to executing the NORM instruction. 


Example: 
REP #$1F    ;maximum number of iterations (31) needed
NORM R3,A   ;perform 1 normalization iteration

               Before Execution     After Execution
A (A2 A1 A0)   00 0000 0001         00 4000 0000
R3             0000                 FFE2


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $00:0000:0001 
and the 16-bit R3 address register contains the value $0000. The repetition of the NORM 
R3,A instruction normalizes the value in the 40-bit accumulator and stores the resulting 
number of shifts performed during that normalization process in the R3 address register. A 
negative value reflects the number of left shifts performed while a positive value reflects the 
number of right shifts performed during the normalization process. In this example, thirty 
left shifts are required for normalization. 
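
The iteration described above can be traced with a behavioral model. The Python sketch below is illustrative only. It assumes the no-scaling definitions of the condition bits, which are not spelled out in this section: E is taken to be set when bits 39-31 are not all equal, and U is taken to be set when bits 31 and 30 are equal. The function names are hypothetical.

```python
MASK40 = (1 << 40) - 1  # 40-bit accumulator

def to_signed(value, bits):
    """Interpret value as a two's-complement number of the given width."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def norm_step(acc, rn):
    """One NORM iteration on a 40-bit accumulator and a 16-bit address register."""
    ext = acc >> 31                                   # bits 39..31 as a 9-bit field
    e = int(ext not in (0, 0x1FF))                    # E: extension in use (assumed definition)
    u = int(((acc >> 31) & 1) == ((acc >> 30) & 1))   # U: unnormalized (assumed definition)
    z = int(acc == 0)                                 # Z: accumulator is zero
    if not e and u and not z:
        return (acc << 1) & MASK40, (rn - 1) & 0xFFFF      # ASL D, Rn - 1 -> Rn
    if e:
        shifted = to_signed(acc, 40) >> 1                  # ASR D preserves the sign
        return shifted & MASK40, (rn + 1) & 0xFFFF         # Rn + 1 -> Rn
    return acc, rn                                         # already normalized or zero: NOP

acc, r3 = 0x0000000001, 0x0000
for _ in range(0x1F):          # models REP #$1F followed by NORM R3,A
    acc, r3 = norm_step(acc, r3)
print(f"A=${acc:010X} R3=${r3:04X}")
```

Under these assumptions the model performs thirty left shifts followed by one idle iteration, leaving $00:4000:0000 in A and $FFE2 (-30) in R3, as in the example.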
Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C

S — Computed according to the standard definition (see section A.4)
L — Set if overflow has occurred in A or B result
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if bit 39 is changed as a result of a left shift


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format:
NORM Rn,D
Opcode:

 15       12 11        8 7         4 3         0
  0  0  0  1 | 0  1  0  1 | 0  0  1  0 | F  0  R  R


Instruction Fields:

D   F       Rn   RR
A   0       R0   00
B   1       R1   01
            R2   10
            R3   11
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
NOT                            Logical Complement                            NOT


Operation:                                   Assembler Syntax:


D̄[31:16] → D[31:16] (parallel move)          NOT D (parallel move)


where the bar over the D (D̄) denotes the logical NOT operator


Description: Take the one’s complement of bits 31-16 of the destination operand D and store the result
back in bits 31-16 of the destination accumulator. This is a 16-bit operation. The remaining
bits of D are not affected.


Example: 
NOT A A,X:(R2)+ ;save A1 and take the 1’s complement of A1

               Before Execution     After Execution
A (A2 A1 A0)   00 1234 5678         00 EDCB 5678
SR (MR:CCR)    0300                 0308


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $00:1234:5678. 
The NOT A instruction takes the one’s complement of bits 31-16 of the A accumulator (A1) 
and stores the result back in the A1 register. The remaining A accumulator bits are not
affected.


Condition Codes Affected: 


|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C


L  — Set if data limiting has occurred during parallel move 
N — Set if bit 31 of A or B result is set 
Z — Set if bits 31-16 of A or B result are zero 
V — Always cleared 
Instruction Format:


NOT D (parallel move)
Opcode:

  1  m  R  R | H  H  H  W | 0  1  1  0 | F  0  0  1


Instruction Fields: Please see the “X Memory Data Move’ description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
OR                            Logical Inclusive OR                            OR


Operation:                                        Assembler Syntax:


S + D[31:16] → D[31:16] (parallel move)           OR S,D (parallel move)


where + denotes the logical inclusive OR operator 


Description: Logically inclusive OR the source operand S with bits 31:16 of the destination operand D 
and store the result in bits 31-16 of the destination accumulator. This instruction is a 16-bit 
operation. The remaining bits of the destination operand D are not affected. 


Example: 
OR Y1,B B,X:(R1) ;save B1, OR Y1 with B

               Before Execution     After Execution
B (B2 B1 B0)   00 1234 5678         00 FF34 5678
Y1             FF00                 FF00


Explanation of Example: Prior to execution, the 16-bit Y1 register contains the value $FF00 and the
40-bit B accumulator contains the value $00:1234:5678. The OR Y1,B instruction logically
OR’s the 16-bit value in the Y1 register with bits 31:16 of the B accumulator (B1) and stores 
the 40-bit result in the B accumulator. 
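
Because the operation touches only bits 31-16, it can be modeled as a masked OR. The Python sketch below is an illustration, not vendor code; the function name `or16` is hypothetical. It also derives the N and Z bits as they are listed under Condition Codes Affected.

```python
def or16(src, acc):
    """OR a 16-bit source into bits 31-16 of a 40-bit accumulator."""
    result = acc | ((src & 0xFFFF) << 16)  # bits 39-32 and 15-0 pass through unchanged
    mid = (result >> 16) & 0xFFFF          # the A1 or B1 portion of the result
    n = (mid >> 15) & 1                    # N: bit 31 of the result
    z = int(mid == 0)                      # Z: bits 31-16 are all zero
    return result, n, z

result, n, z = or16(0xFF00, 0x0012345678)
print(f"${result >> 32:02X}:{(result >> 16) & 0xFFFF:04X}:{result & 0xFFFF:04X} N={n} Z={z}")
```

For the operands of the example the model leaves B2 and B0 untouched and produces $FF34 in B1.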


Condition Codes Affected: 


|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C


S — Computed according to the standard definition (see section A.4) 
L  — Set if data limiting has occurred during parallel move 
N — Set if bit 31 of A or B result is set 
Z — Set if bits 31-16 of A or B result are zero 
V — Always cleared 
Instruction Format:


OR S,D (parallel move)
Opcode: 


Instruction Fields: Please see the “X Memory Data Move’ description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


S,D    JJ  F
X0,A   00  0
X0,B   00  1
Y0,A   01  0
Y0,B   01  1
X1,A   10  0
X1,B   10  1
Y1,A   11  0
Y1,B   11  1

Timing: 2 + mv oscillator clock cycles 

Memory: 1 program word 
ORI                               OR Immediate                               ORI


Operation:              Assembler Syntax:


#xx + D → D             OR(I) #xx,D


where + denotes the logical inclusive OR operator 


Description: Logically OR the 8-bit immediate operand (#xx) with the contents of the destination control
register D and store the result in the destination control register. The condition codes are
affected only when the condition code register (CCR) is specified as the destination operand.

Restrictions: The ORI #xx,MR instruction cannot be used immediately before an ENDDO or RTI
instruction and cannot be one of the last three instructions in a DO loop (at LA-2, LA-1, or LA). The
ORI #xx,CCR instruction cannot be used immediately before an RTI instruction.


Example: 
OR #$8,MR ;set scaling mode bit S1 to scale up

               Before Execution     After Execution
SR (MR:CCR)    0300                 0B00


Explanation of Example: Prior to execution, the 8-bit mode register (MR) contains the value $03. The 
OR #$8,MR instruction logically OR’s the immediate 8-bit value $8 with the contents of the 
mode register and stores the result in the mode register. 
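
The example can be reproduced with a short model of the status register, where MR is the upper byte and CCR the lower byte. The Python sketch below is illustrative only; the function name `ori_mr` is an assumption made for this illustration.

```python
def ori_mr(sr, imm8):
    """OR an 8-bit immediate into MR, the upper byte of the 16-bit status register."""
    mr = (sr >> 8) & 0xFF     # mode register, SR bits 15-8
    ccr = sr & 0xFF           # condition code register, SR bits 7-0
    mr |= imm8 & 0xFF
    return (mr << 8) | ccr    # CCR is left untouched when MR is the destination

print(f"${ori_mr(0x0300, 0x08):04X}")
```

Bit 3 of the immediate lands on SR bit 11, which is the S1 position in the register layout, so the model returns $0B00 for the values of the example.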




Condition Codes Affected: 


|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C


For CCR operand: 
S — Set if bit 7 of the immediate operand is set 
L — Set if bit 6 of the immediate operand is set 
E — Set if bit 5 of the immediate operand is set 
U — Set if bit 4 of the immediate operand is set 
N — Set if bit 3 of the immediate operand is set 
Z — Set if bit 2 of the immediate operand is set 
V — Set if bit 1 of the immediate operand is set 
C — Set if bit 0 of the immediate operand is set 


For MR and OMR operands: 


The condition codes are not affected using these operands 
Instruction Format: 


OR(I) #xx,D 
Opcode: 


15 12 11 8 7 4 3 0 


000 %14;]1 E E fyi ft ft ti it i t 


Instruction Fields:


Timing: 2 oscillator clock cycles 
Memory: 1 program word 
REP                          Repeat Next Instruction                          REP


Operation:                                  Assembler Syntax:


LC → TEMP; X:(Rn) → LC                      REP X:(Rn)
Repeat next instruction until LC = 1
TEMP → LC


LC → TEMP; #xx → LC                         REP #xx
Repeat next instruction until LC = 1
TEMP → LC


LC → TEMP; S → LC                           REP S
Repeat next instruction until LC = 1
TEMP → LC

Description: Repeat the single word instruction immediately following the REP instruction the specified 
number of times. The value specifying the number of times the given instruction is to be 
repeated is loaded into the 16-bit loop counter (LC) register. The single word instruction is 
then executed the specified number of times, decrementing the loop counter (LC) after 
each execution until (LC) = 1. When the REP instruction is in effect, the repeated instruction 
is fetched only one time and it remains in the instruction register for the duration of the loop 
count. Thus, the REP instruction is not interruptible. The current loop counter (LC) value is 
stored in an internal temporary register. If LC is set equal to Zero, the instruction is not re- 
peated. The instruction’s effective address specifies the address of the value which is to be 
loaded into the loop counter (LC). 


If the A or B accumulator is specified as a source operand, the accumulator value is optionally
shifted according to the scaling mode bits S0 and S1 in the system status register (SR).
If the data out of the shifter indicates that the accumulator extension is in use, the value to 
be loaded into the loop counter (LC) register will be limited to a 16-bit maximum positive or 
negative saturation constant to minimize the error due to truncation. The resulting values 
are then stored in the 16-bit loop counter (LC) register. 


If the system stack register SSH is specified as a source operand, the system stack pointer 
(SP) is postdecremented by 1 after SSH has been read. 


Restrictions: 


The REP instruction can repeat any single word instruction except the REP instruction itself and any instruc- 
tion that changes program flow. The following instructions are not allowed to follow a REP instruction: 


Immediately after REP    DO, BRKcc      Bcc, Jcc      DEBUG, DEBUGcc
                         JCLR           BRA, JMP      WAIT
                         BScc, JScc     BSR, JSR
                         REP, REPcc     RTI
                         RTS            STOP
                         SWI            Tcc


Also, a REP instruction cannot be the last instruction in a DO loop (at LA). The assembler will generate an 
error if any of the above instructions are found immediately following a REP instruction. 




Example: 
REP X0                                ;repeat (X0) times
MAC X1,Y1,A X:(R1)+,X1 X:(R3)+,Y1     ;X1*Y1+A → A, update X1,Y1

      Before Execution     After Execution
LC    0000                 0100
X0    0100                 0100
Explanation of Example: Prior to execution, the 16-bit X0 register contains the value $0100 and the
16-bit loop counter (LC) register contains the value $0000. Execution of the REP X0 instruction takes the
16-bit value in the X0 register and stores it in the 16-bit loop counter (LC) register. Thus, the single word
MAC instruction immediately following the REP instruction is repeated $100 times.


Condition Codes Affected: 


|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C


L — Set if data limiting occurred using A or B as source operands 


Instruction Format and Opcode: 


REP X:(Rn) 
15 12 11 8 7 4 3 0 Rn RR 
00 0 O10 OO Of 44 —l—>e FR RO 00 
R1 01 
R2 10 
R3 11 
“—” = don’t care 
REP #XX 
15 12 11 8 7 4 3 0 
#XX: immediate 8-bit 
S    DDDDD      S    DDDDD      S    DDDDD      S    DDDDD
X0   00000      SR   01001      R0   10000      SSH  11000
Y0   00001      OMR  01010      R1   10001      SSL  11001
X1   00010      SP   01011      R2   10010      LA   11010
Y1   00011      A1   01100      R3   10011      LC   01000
A    00100      B1   01101      M0   10100      N0   11100
B    00101      A2   01110      M1   10101      N1   11101
A0   00110      B2   01111      M2   10110      N2   11110
B0   00111                      M3   10111      N3   11111
Timing: 6 + mv oscillator clock cycles if the argument equals zero; 
otherwise it is 4 + mv oscillator clock cycles 
Memory: 1 program word
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REPcc Repeat Next Instruction Conditionally REPcc 


Operation: Assembler Syntax: 
Repeat next instruction until cc is true REPcc 


Description: Repeat the single word instruction immediately following the REPcc instruction until the 
specified condition is true. The instruction immediately following will not be executed if the 
condition is true on entry. No new value is loaded into the 16-bit loop counter (LC) register. 
When the REPcc instruction is in effect, the repeated instruction is fetched only one time 
and it remains in the instruction register until the specified condition is true. Thus, the 
REPcc instruction is not interruptible. 


The term “cc” may specify the following conditions: 
“cc” Mnemonic                                 Condition

CC (HS) — carry clear (higher or same)        C = 0
CS (LO) — carry set (lower)                   C = 1
EC      — extension clear                     E = 0
EQ      — equal                               Z = 1
ES      — extension set                       E = 1
GE      — greater than or equal               N ⊕ V = 0
GT      — greater than                        Z + (N ⊕ V) = 0
LC      — limit clear                         L = 0
LE      — less than or equal                  Z + (N ⊕ V) = 1
LS      — limit set                           L = 1
LT      — less than                           N ⊕ V = 1
MI      — minus                               N = 1
NE      — not equal                           Z = 0
NR      — normalized                          Z + (Ū • Ē) = 1
PL      — plus                                N = 0
NN      — not normalized                      Z + (Ū • Ē) = 0

where: Ū denotes the logical complement of U,
       + denotes the logical OR operator,
       • denotes the logical AND operator,
       ⊕ denotes the logical Exclusive OR operator

Restrictions: 

The REPcc instruction can repeat any single word instruction except the REPcc instruction itself and any 
instruction that changes program flow. The following instructions are not allowed to follow a REPcc instruc- 
tion: 


Immediately after REPcc DO Bcc, Jcc DEBUG, DEBUGcc
JCLR BRA, JMP Tec 
BScc, JScc BSR, JSR 
BRKcc Tcc 
REP, REPcc RTI 
RTS STOP 
SWI WAIT 
move to SSH any write to memory 
Also, a REPcc instruction cannot be the last instruction in a DO loop (at LA). The assembler will generate
an error if any of the above instructions are found immediately following a REPcc instruction.


Example: 


REPNR ;rep until normalized
NORM R1,A 


Explanation of Example: This example illustrates a conditional REP instruction. The NORM instruction 
will be repeated until the accumulator A is normalized. 


Condition Codes: 
The condition codes are not affected by this instruction. 


Instruction Format and Opcode: 
REPcc expr 


 15       12 11        8 7         4 3         0
  0  0  0  0 | 0  0  0  1 | 0  1  0  1 | c  c  c  c


Instruction Field for the second word: 


cc = 4-bit condition code = CCCC 


Mnemonic   c c c c      Mnemonic   c c c c
CC(HS)     0 0 0 0      CS(LO)     1 0 0 0
GE         0 0 0 1      LT         1 0 0 1
NE         0 0 1 0      EQ         1 0 1 0
PL         0 0 1 1      MI         1 0 1 1
NN         0 1 0 0      NR         1 1 0 0
EC         0 1 0 1      ES         1 1 0 1
LC         0 1 1 0      LS         1 1 1 0
GT         0 1 1 1      LE         1 1 1 1
Timing: 4 oscillator clock cycles when condition true on entry 

6 oscillator clock cycles when condition false on entry 
Memory: 1 program word 
RESET                      Reset On-Chip Peripherals                      RESET


Operation: Assembler Syntax: 


Reset the Interrupt Priority Register and all on-chip peripherals RESET 


Description: Reset the Interrupt Priority Register and all on-chip peripherals. This is a software reset 
which is not equivalent to a hardware reset since only on-chip peripherals and the interrupt 
structure are affected. The processor state is not affected and execution continues with the 
next instruction. All interrupt sources are disabled except for the trace, stack error, and re- 
set interrupts. 

Restrictions: 

A RESET instruction cannot be the last instruction in a DO loop (at LA). 

Example: 


RESET ;reset all on-chip peripherals and IPR, set I1,I0

Explanation of Example: Execution of the RESET instruction resets all on-chip peripherals and the
Interrupt Priority Register (IPR).


Condition Codes Affected: 
The condition codes are not affected by this instruction. 
Instruction Format:


RESET
Opcode:

 15       12 11        8 7         4 3         0
  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 1  0  0  0


Instruction Fields: none 


Timing: 4 oscillator clock cycles 
Memory: 1 program word 
RND                            Round Accumulator                            RND


Operation:                          Assembler Syntax:


D + r → D (parallel move)           RND D (parallel move)


Description: Round the 40-bit value in the specified destination operand D and store the result in the 
MSP portion of the destination accumulator (A1 or B1). This instruction uses the rounding 
technique selected by the R bit in the Operating Mode Register (OMR). When the R bit in 
OMR is cleared (default mode), the convergent rounding is selected. When the R bit of 
OMR is set, the twos-complement rounding is selected. The value of the rounding constant 
added is determined by the scaling mode bits S0 and S1 in the system status register (SR).
Refer to Section 3.2.5 for more information about the rounding modes. 


Example: 
RND A B,Y1 ;round A accumulator into A1, zero A0, save B1 first

Case    Before Execution (A2 A1 A0)    After Execution (A2 A1 A0)
I       00 1236 789A                   00 1236 0000
II      00 1236 8000                   00 1236 0000
III     00 1235 8000                   00 1236 0000


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $00:1236:789A 
for Case I, the value $00:1236:8000 for Case II and the value $00:1235:8000 for Case III. 
Execution of the RND A instruction rounds the value in the A accumulator into the MSP
portion of the A accumulator (A1) and then zeros the LSP portion of the A accumulator (A0).
The example is given assuming that the convergent rounding is selected. Note that case II 
is the special case that distinguishes convergent rounding from the twos complement 
rounding. 
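
The three cases can be reproduced with a small model of the two rounding techniques. The Python sketch below is an illustration under stated assumptions: no scaling is in effect, so the rounding constant is added at bit 15, and convergent rounding is modeled as round-half-to-even on the MSP portion. The function names are hypothetical.

```python
MASK40 = (1 << 40) - 1  # 40-bit accumulator

def rnd_twos_complement(acc):
    """Add the rounding constant at bit 15, then clear the LSP portion."""
    rounded = (acc + 0x8000) & MASK40
    return rounded & ~0xFFFF & MASK40

def rnd_convergent(acc):
    """As above, but an exact tie rounds to the value whose bit 16 is even."""
    rounded = (acc + 0x8000) & MASK40
    if (acc & 0xFFFF) == 0x8000:   # LSP is exactly one half
        rounded &= ~(1 << 16)      # force the LSB of the MSP portion to zero
    return rounded & ~0xFFFF & MASK40

for case in (0x001236789A, 0x0012368000, 0x0012358000):
    print(f"${case:010X} -> ${rnd_convergent(case):010X}")
```

All three cases yield $00:1236:0000 under the convergent model. For Case II the twos-complement model returns $00:1237:0000 instead, which is the difference the explanation refers to.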
Condition Codes:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C

S — Computed according to the standard definition (see section A.4)
L — Set if limiting (parallel move) or overflow has occurred in result
E — Set if the signed integer portion of A or B result is in use
U — Set according to the standard definition of the U bit
N — Set if bit 39 of A or B result is set
Z — Set if A or B result equals zero
V — Set if overflow has occurred in A or B result


Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to 
Section A.4 entitled “Condition Code Computation” for complete details. 


Instruction Format: 
RND D (parallel move) 
Opcode: 
15 12 11 8 7 4 3 0 


  1  m  R  R | H  H  H  W | 0  0  1  0 | F  0  0  0


Instruction Fields: Please see the “X Memory Data Move’ description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
ROL                               Rotate Left                               ROL


Assembler Syntax:


ROL D (parallel move)

Operation:
[Diagram: bits 31-16 of D (D1) rotate left one bit through the carry bit C; D2 and D0 are
unchanged] (parallel move)


Description: Rotate bits 31-16 of the destination operand D one bit to the left and store the result in the 
destination accumulator. Bit 31 of D prior to instruction execution is shifted into the carry bit 
C and the value in the carry bit C prior to instruction execution is shifted into bit 16 of the 
destination accumulator D. This instruction is a 16-bit operation. The remaining bits of the 
destination operand D are not affected. 


Example: 
ROL A (R3)- ;rotate A1 left one bit, update R3

               Before Execution     After Execution
A (A2 A1 A0)   FE 0000 1234         FE 0001 1234
SR (MR:CCR)    0001                 0000


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value 
$FE:0000:1234. Execution of the ROL A instruction shifts the 16-bit value in the A1 register 
one bit to the left, shifting bit 31 into the carry bit C, rotating the carry bit C into bit 16, and 
storing the result back in the A1 register. 
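
The rotate-through-carry behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not Motorola code: the function name `rol16` is hypothetical, and only the A1 word and the carry bit are modelled (A2, A0 and the other condition codes are ignored).

```python
def rol16(a1: int, carry: int) -> tuple[int, int]:
    """Rotate a 16-bit word left one bit through the carry bit.

    a1    -- value of A1 or B1 (accumulator bits 31-16)
    carry -- carry bit C before execution (0 or 1)
    Returns (new word, new carry).
    """
    a1 &= 0xFFFF
    new_carry = (a1 >> 15) & 1                    # old bit 31 moves into C
    result = ((a1 << 1) | (carry & 1)) & 0xFFFF   # old C enters at bit 16
    return result, new_carry


# Values from the example: A1 = $0000 with C = 1 (SR = $0001) before execution.
a1, c = rol16(0x0000, 1)
print(f"A1=${a1:04X} C={c}")
```

With the example's inputs the model yields A1 = $0001 and C = 0, matching the "After Execution" column.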
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NCL 


ROL                              Rotate Left                              ROL


Condition Codes:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C


S — Computed according to the standard definition (see section A.4) 

L  — Set if data limiting has occurred during parallel move 

N — Set if bit 31 of A or B result is set 

Z — Set if bits 31-16 of A or B result are zero 

V — Always cleared 

C — Set if bit 31 of A or B was set prior to instruction execution 
Instruction Format: 

ROL D (parallel move) 
Opcode: 
15 12 11 8 7 4 3 0 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
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ROR Rotate Right ROR 


Assembler Syntax: 


ROR D (parallel move) 
Operation:
C -> [ D1, bits 31-16 ] -> C        D2 and D0 unchanged        (parallel move)


Description: Rotate bits 31-16 of the destination operand D one bit to the right and store the result in the 
destination accumulator. Bit 16 of D prior to instruction execution is shifted into the carry bit 
C and the value in the carry bit C prior to instruction execution is shifted into bit 31 of the 
destination accumulator D. This instruction is a 16-bit operation. The remaining bits of the 
destination operand D are not affected. 


Example:
ROR B (R2)+N2       ;rotate B1 right one bit, update R2

Before Execution            After Execution
     0001  1234                  0000  1234
B2   B1    B0               B2   B1    B0
0000                        0005
SR=MR:CCR                   SR=MR:CCR


Explanation of Example: Prior to execution, the 40-bit B accumulator contains the value $00:0001:1234.
Execution of the ROR B instruction shifts the 16-bit value in the B1 register one bit to the 
right, shifting bit 16 into the carry bit C, rotating the carry bit C into bit 31, and storing the 
result back in the B1 register. 
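
As a hedged illustration of why the status register changes from $0000 to $0005 in this example, the following Python sketch models the rotate and the N, Z, V and C bits as they are defined for ROR. The function name `ror16` is hypothetical, and the S, L, E and U bits are not modelled.

```python
def ror16(b1: int, carry: int) -> tuple[int, int]:
    """Rotate a 16-bit word right one bit through carry.

    Returns (new word, CCR bits N/Z/V/C packed in bits 3..0).
    """
    b1 &= 0xFFFF
    new_carry = b1 & 1                           # old bit 16 moves into C
    result = (b1 >> 1) | ((carry & 1) << 15)     # old C enters at bit 31
    n = (result >> 15) & 1                       # N: bit 31 of the result
    z = int(result == 0)                         # Z: bits 31-16 all zero
    v = 0                                        # V: always cleared
    ccr = (n << 3) | (z << 2) | (v << 1) | new_carry
    return result, ccr


# Values from the example: B1 = $0001 with C = 0 before execution.
b1, ccr = ror16(0x0001, 0)
print(f"B1=${b1:04X} CCR=${ccr:02X}")
```

The result word is zero (Z = 1) and the bit rotated out is one (C = 1), which gives CCR = $05.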
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NCL 


ROR Rotate Right ROR 


Condition Codes:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C


S — Computed according to the standard definition (see section A.4) 

L — Set if data limiting has occurred during parallel move 

N — Set if bit 31 of A or B result is set 

Z — Set if bits 31-16 of A or B result are zero 

V — Always cleared 

C — Set if bit 16 of A or B was set prior to instruction execution 
Instruction Format: 

ROR D (parallel move)
Opcode: 
15 12 11 8 7 4 3 0 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


D F 
A 0 
B 1 
Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
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RTI Return from Interrupt RTI 


Operation:                                  Assembler Syntax:

SSH → PC; SSL → SR; SP-1 → SP               RTI


Description: Pull the program counter (PC) and the status register (SR) from the system stack. The
previous program counter and status register are lost.


Restrictions:
Due to pipelining in the program controller and the fact that the RTI instruction accesses certain program
controller registers, the RTI instruction must not be immediately preceded by any of the following
instructions:

Immediately before RTI    MOVE(C) to SR, SSH, SSL, or SP
                          MOVE(C) from SSH
                          ANDI MR or ANDI CCR
                          ORI MR or ORI CCR


An RTI instruction cannot be the last instruction in a DO loop (at LA). 
An RTI instruction cannot be repeated using the REP instruction. 
Example: 


RTI ;pull PC and SR from the system stack 


Explanation of Example: The RTI instruction pulls the 16-bit program counter (PC) and the 16-bit status 
register (SR) from the system stack and updates the system stack pointer (SP). 
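
The stack operations performed by RTI (and, for comparison, RTS) can be sketched as a small Python model. This is a simplified illustration under stated assumptions: the class `ProgramController`, its `push` helper (standing in for whatever stacked PC and SR on interrupt entry) and the dictionary-based stack are all hypothetical, and pipeline restrictions are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ProgramController:
    pc: int = 0
    sr: int = 0
    sp: int = 0
    stack: dict = field(default_factory=dict)   # SP value -> (SSH, SSL)

    def push(self, ssh: int, ssl: int) -> None:
        """Hypothetical helper: stack a PC/SR pair (as interrupt entry would)."""
        self.sp += 1
        self.stack[self.sp] = (ssh & 0xFFFF, ssl & 0xFFFF)

    def rti(self) -> None:
        """SSH -> PC; SSL -> SR; SP-1 -> SP."""
        self.pc, self.sr = self.stack[self.sp]
        self.sp -= 1

    def rts(self) -> None:
        """SSH -> PC; SP-1 -> SP. SR is not affected."""
        self.pc, _ = self.stack[self.sp]
        self.sp -= 1


pcu = ProgramController(pc=0x0100, sr=0x0300)
pcu.push(0x0042, 0x0005)     # assumed return address and saved SR
pcu.rti()
print(f"PC=${pcu.pc:04X} SR=${pcu.sr:04X} SP={pcu.sp}")
```

The only difference between the two return instructions in this model is that `rts` discards the stacked SSL value instead of loading it into SR.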


Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Set according to the value pulled from the stack
L - Set according to the value pulled from the stack
E - Set according to the value pulled from the stack
U - Set according to the value pulled from the stack
N - Set according to the value pulled from the stack
Z - Set according to the value pulled from the stack
V - Set according to the value pulled from the stack
C - Set according to the value pulled from the stack
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RTI Return from Interrupt RTI 


Instruction Format: 


RTI 

Opcode: 

15 12 11 8 7 4 3 0 

  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 0  1  1  1
Timing: 4 + rx oscillator clock cycles 
Memory: 1 program word 
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RTS Return from Subroutine RTS 


Operation:                                  Assembler Syntax:

SSH → PC; SP-1 → SP                         RTS


Description: Pull the program counter (PC) from the system stack. The previous program counter is lost.
The status register (SR) is not affected.


Restrictions:

Due to pipelining in the program controller and the fact that the RTS instruction accesses certain program
controller registers, the RTS instruction must not be immediately preceded by any of the following
instructions:

Immediately before RTS    MOVE(C) to SSH, SSL, or SP
                          MOVE(M) to SSH, SSL, or SP
                          MOVE(P) to SSH, SSL, or SP
                          MOVE(C) from SSH
                          MOVE(M) from SSH
                          MOVE(P) from SSH

An RTS instruction cannot be the last instruction in a DO loop (at LA). 
An RTS instruction cannot be repeated using the REP instruction. 


Example: 
RTS ;pull PC from the system stack 
Explanation of Example: The RTS instruction pulls the 16-bit program counter (PC) from the system 
stack and updates the system stack pointer (SP). 


Condition Codes Affected: 
The condition codes are not affected by this instruction. 
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RTS Return from Subroutine RTS 


Instruction Format: 


RTS 

Opcode: 

15 12 11 8 7 4 3 0 

  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 0  1  1  0
Timing: 4 + rx oscillator clock cycles 
Memory: 1 program word 
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S BC Subtract Long with Carry S BC 


Operation:                                  Assembler Syntax:

D-S-C → D    (parallel move)                SBC S,D    (parallel move)


Description: Subtract the source operand S and the carry bit C of the condition code register from the
destination operand D and store the result in the destination accumulator. Long words (32
bits) may be subtracted from the (40-bit) destination accumulator.


Note: The carry bit is set correctly for multiple precision arithmetic using long word operands if the
extension register of the destination accumulator (A2 or B2) is the sign extension of bit 31 of the
destination accumulator (A or B).


Example:
; 64-bit subtraction: B2:B1:B0:A1:A0 - Y1:Y0:X1:X0 → B2:B1:B0:A1:A0
SUB X,A             ;subtract LS words
SBC Y,B             ;subtract MS words with carry

B Before Execution                    A Before Execution
00   0000  0003                       00   0000  0000
B2   B1    B0                         A2   A1    A0
0000  0001                            8000  0000
Y1    Y0                              X1    X0
(Y1:Y0 not affected by operation)     (X1:X0 not affected by operation)

B After Execution                     A After Execution
00   0000  0001                       00   8000  0000
B2   B1    B0                         A2   A1    A0


Explanation of Example: This example illustrates long word double precision (64-bit) subtraction using
the SBC instruction. Prior to execution of the SUB and SBC instructions, the 64-bit value
$0000:0001:8000:0000 is loaded into the Y and X registers (Y:X), respectively. The other
double precision 64-bit value $0000:0003:0000:0000 is loaded into the B and A accumulators
(B:A), respectively. Since the 32-bit value loaded into the A accumulator is automatically
sign extended to 40 bits and the other 32-bit long word operand is internally sign extended
to 40 bits during instruction execution, the carry bit will be set correctly after the
execution of the SUB X,A instruction. The SBC Y,B instruction then produces the correct MS
40-bit result.
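
The two-step subtraction can be checked with a short Python model. It is a sketch under simplifying assumptions: accumulators are treated as unsigned 40-bit integers, the carry is modelled as the borrow out of bit 39, and the helper names `sext32_to_40` and `sub40` are hypothetical.

```python
MASK40 = (1 << 40) - 1


def sext32_to_40(value: int) -> int:
    """Sign extend a 32-bit long word (X or Y) to 40 bits."""
    value &= 0xFFFFFFFF
    return value | (0xFF << 32) if value & 0x80000000 else value


def sub40(d: int, s: int, c: int = 0) -> tuple[int, int]:
    """Compute D - S - C on 40-bit values; return (result, carry/borrow)."""
    diff = (d & MASK40) - (s & MASK40) - (c & 1)
    return diff & MASK40, int(diff < 0)


# Register contents from the example.
A, B = 0x00_0000_0000, 0x00_0000_0003
X, Y = 0x8000_0000, 0x0000_0001

A, carry = sub40(A, sext32_to_40(X))        # SUB X,A
B, _ = sub40(B, sext32_to_40(Y), carry)     # SBC Y,B
print(f"B=${B:010X} A=${A:010X} carry after SUB={carry}")
```

The low-order subtraction borrows (carry = 1), so the high-order result is 3 - 1 - 1 = 1, in agreement with the "After Execution" values.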
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| 
SBC Subtract Long with Carry SBC 


Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of A or B result is in use
U - Set according to the standard definition of the U bit
N - Set if bit 39 of A or B result is set
Z - Set if A or B result equals zero. Cleared otherwise
V - Set if overflow has occurred in A or B result
C - Set if a carry (or borrow) occurs from bit 39 of A or B result

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to
Section A.4 entitled “Condition Code Computation” for complete details.


Instruction Format: 
SBC S,D (parallel move) 
Opcode: 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for 
details on the m, RR, HHH, and W data fields. 


S,D   J  F
X,A   0  0
X,B   0  1
Y,A   1  0
Y,B   1  1
Timing: 2+mv oscillator clock cycles 
Memory: 1 program word 
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STOP Stop Instruction Processing STOP 


Operation: Assembler Syntax: 


Enter the STOP processing state and stop the clock oscillator STOP 


Description: Enter the STOP processing state. All activity in the processor is suspended until the RESET
or IRQA pin is asserted. The STOP processing state is a low-power standby mode.


During the STOP state, port A is in an idle state with the control signals held inactive (i.e.,
PS/DS=VCC etc.), the data pins (D0-D23) are high impedance, and the address pins (A1-A15)
are unchanged from the previous instruction. If the bus grant was asserted when the
STOP instruction was executed, port A will remain three-stated until the DSP exits the
STOP state.


When the exit from the stop state is caused by a low level on the RESET pin, the processor
will enter the reset processing state. The time to recover from the STOP state using
RESET will depend on a clock stabilization delay controlled by the SD bit in the OMR.


When the exit from the stop state is caused by a low level on the IRQA pin, the processor
will service the highest priority pending interrupt and will not service the IRQA interrupt
unless it is highest priority. The interrupt will be serviced after an internal delay counter
counts 524,284 clock phases (i.e., [2^19-4]T), or 28 clock phases (i.e., [2^5-4]T) if the
stop delay (SD) bit in the OMR is set to one. During this clock stabilization count delay, all
peripherals and external interrupts are cleared and re-enabled/arbitrated at the start of the
17T period following the count interval. The processor will resume program execution at the
instruction following the STOP instruction that caused the entry into the stop state after the
interrupts have been serviced or, if no interrupt was pending, immediately after the delay
count plus 17T. If the IRQA pin is asserted when the STOP instruction is executed, the
internal delay counter will be started. Refer to Section 7.6 for details on the STOP mode.
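
The two recovery delays quoted above follow directly from the counter lengths. The short Python sketch below reproduces the arithmetic; the function names are hypothetical, and the conversion to seconds takes the duration of one clock phase T as a caller-supplied assumption rather than a value given in this section.

```python
def stop_delay_phases(sd_bit: int) -> int:
    """Clock phases counted by the internal delay counter after IRQA."""
    return (2**5 - 4) if sd_bit else (2**19 - 4)


def resume_phases(sd_bit: int) -> int:
    """Phases until execution resumes when no interrupt is pending
    (delay count plus the 17T arbitration period)."""
    return stop_delay_phases(sd_bit) + 17


def resume_time_seconds(sd_bit: int, phase_seconds: float) -> float:
    """Convert to time; phase_seconds (length of one T) is an assumed input."""
    return resume_phases(sd_bit) * phase_seconds


print(stop_delay_phases(0), stop_delay_phases(1))
print(resume_phases(0), resume_phases(1))
```

Setting SD shortens the wait by more than four orders of magnitude, which is only appropriate when the clock source is already stable on exit from the STOP state.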


Restrictions: 


— A STOP instruction cannot be used in a fast interrupt routine. 
— A STOP instruction cannot be the last instruction in a DO loop (i.e., at LA). 
— A STOP instruction cannot be repeated using the REP instruction. 


Example: 


STOP ;enter low-power standby mode 
Explanation of Example: The STOP instruction suspends all processor activity until the processor is
reset or interrupted as previously described. The STOP instruction puts the processor in a
low-power standby mode.


Condition Codes Affected: 
The condition codes are not affected by this instruction. 
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STOP Stop Instruction Processing STOP 


Instruction Format: 


STOP 
Opcode: 


  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 1  0  1  0


Instruction Fields: None 


Timing: The STOP instruction disables internal distribution of the clock. 
Memory: 1 program word 
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SU B Subtract SU B 


Operation:                                  Assembler Syntax:
D-S → D    (parallel move)                  SUB S,D    (parallel move)
D-S → D    (two parallel reads)             SUB S,D    (two parallel reads)


Description: Subtract the source operand S from the destination operand D and store the result in the 
destination operand D. Words (16 bits), long words (32 bits) and accumulators (40 bits) 
may be subtracted from the destination accumulator. 


Note: The carry bit is set correctly using word or long word source operands if the extension register of 
the destination accumulator (A2 or B2) is the sign extension of bit 31 of the destination accumulator (A or 
B). The carry bit is always set correctly using accumulator source operands. 


Example:
SUB X1,A X:(R2)+N2,X0       ;16-bit subtract, load X0, update R2

Before Execution            After Execution
00   0058  1234             00   0055  1234
A2   A1    A0               A2   A1    A0
0003                        0003
X1                          X1


Explanation of Example: Prior to execution, the 16-bit X1 register contains the value $0003 and the
40-bit A accumulator contains the value $00:0058:1234. The SUB instruction automatically
appends 16 LS zeros to the 16-bit value in the X1 register, sign extends the resulting 32-bit
long word to 40 bits, and subtracts the result from the 40-bit A accumulator. Thus, 16-bit
operands are subtracted from the MSP portion of A or B (A1 or B1) because all arithmetic
instructions assume a fractional, two’s complement data representation. Note that 16-bit
operands can be subtracted from the LSP portion of A or B (A0 or B0) by loading the 16-bit
operand into X0 or Y0, forming a 32-bit word by loading X1 or Y1 with the sign extension
of X0 or Y0, and executing a SUB X,A or SUB Y,A instruction.
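
The operand alignment described in this explanation can be sketched in Python as follows. This is an illustrative model only: the helper names are hypothetical, accumulators are treated as unsigned 40-bit integers, and condition codes are not computed.

```python
MASK40 = (1 << 40) - 1


def word_to_acc_operand(word: int) -> int:
    """Append 16 LS zeros to a 16-bit word and sign extend to 40 bits."""
    long_word = (word & 0xFFFF) << 16
    if long_word & 0x80000000:
        long_word |= 0xFF << 32
    return long_word


def sub_word(acc: int, word: int) -> int:
    """Model of SUB X1,A (or any 16-bit source): D - S -> D."""
    return (acc - word_to_acc_operand(word)) & MASK40


# Values from the example: A = $00:0058:1234, X1 = $0003.
result = sub_word(0x00_0058_1234, 0x0003)
print(f"A=${result:010X}")
```

Because the word is aligned with A1, only the MSP changes ($0058 becomes $0055) while A0 keeps the value $1234.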
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SU B Subtract SU B 


Condition Codes:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of A or B result is in use
U - Set according to the standard definition of the U bit
N - Set if bit 39 of A or B result is set
Z - Set if A or B result equals zero
V - Set if overflow has occurred in A or B result
C - Set if a carry (or borrow) occurs from bit 39 of A or B result

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to
Section A.4 entitled “Condition Code Computation” for complete details.


Instruction Format: SUB S,D (parallel move) 
Opcode: 
15 12 11 8 7 4 3 0 


imarinwwwletoolru sy 


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, W, and mm data fields.


Instruction Format: SUB S,D (two parallel reads) 
Opcode: 
15 12 11 8 7 4 3 0 


0 1414mim K K KIO +r r u}F ou eu eu 


Instruction Fields: Please see the “Dual X Memory Data Read” description in the parallel move sec- 
tion for details on the mm and KKK data fields. 


One Parallel Operation                   Two Parallel Reads

S,D   JJJ F    S,D   JJJ F               S,D   uuuu F    S,D   uuuu F
B,A   000 0    X0,B  100 1               X0,A  0100 0    Y1,B  0111 1
A,B   000 1    Y0,A  101 0               X0,B  0100 1
X,A   010 0    Y0,B  101 1               Y0,A  0101 0    B,A   1101 0
X,B   010 1    X1,A  110 0               Y0,B  0101 1    A,B   1101 1
Y,A   011 0    X1,B  110 1               X1,A  0110 0
Y,B   011 1    Y1,A  111 0               X1,B  0110 1
X0,A  100 0    Y1,B  111 1               Y1,A  0111 0

Timing: 2+ mv oscillator clock cycles 

Memory: 1 program word 
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SUBL 


Shift Left and Subtract Accumulators


Operation:                                  Assembler Syntax:

D*2-S → D    (parallel move)                SUBL S,D    (parallel move)


Description: Subtract the source operand S from two times the destination operand D and store the
result in the destination accumulator. The destination operand D is arithmetically shifted one
bit to the left and a zero is shifted into the LS bit of D prior to the subtraction operation. The
carry bit is set correctly if the source operand does not overflow as a result of the left shift
operation. The overflow bit may be set as a result of either the shifting or subtraction
operation (or both). This instruction is useful for efficient divide and decimation in time (DIT) FFT
algorithms.


Example:
SUBL B,A X:(R3)+,X1         ;A*2-B → A, update X1 and R3

Before Execution            After Execution
00   0000  2468             00   0000  369C
A2   A1    A0               A2   A1    A0
00   0000  1234             00   0000  1234
B2   B1    B0               B2   B1    B0


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $00:0000:2468
and the 40-bit B accumulator contains the value $00:0000:1234. The SUBL B,A instruction
subtracts the value in the B accumulator from two times the value in the A accumulator and
stores the 40-bit result in the A accumulator.


Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4)
L - Set if limiting (parallel move) or overflow has occurred in result
E - Set if the signed integer portion of A or B result is in use
U - Set according to the standard definition of the U bit
N - Set if bit 39 of A or B result is set
Z - Set if A or B result equals zero
V - Set if overflow has occurred in A or B result or if the MSB of the destination
    operand is changed as a result of the instruction’s left shift.
C - Set if a carry (or borrow) occurs from bit 39 of A or B result

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to
Section A.4 entitled “Condition Code Computation” for complete details.
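
The arithmetic of the SUBL example (A*2 - B) can be reproduced with a minimal Python sketch. The function name `subl` is hypothetical, accumulators are treated as unsigned 40-bit integers, and the condition codes (including the overflow cases mentioned in the description) are not modelled.

```python
MASK40 = (1 << 40) - 1


def subl(d: int, s: int) -> int:
    """Model of SUBL S,D: shift D left one bit, then subtract S."""
    shifted = (d << 1) & MASK40        # zero enters the LS bit of D
    return (shifted - s) & MASK40


# Values from the example: A = $00:0000:2468, B = $00:0000:1234.
result = subl(0x00_0000_2468, 0x00_0000_1234)
print(f"A=${result:010X}")
```

Doubling $2468 gives $48D0, and subtracting $1234 leaves $369C, the value shown in the "After Execution" column.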
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S U B L Shift Left and Subtract Accumulators S U B L 


Instruction Format:


SUBL S,D (parallel move)
Opcode:

 15       12 11        8  7        4  3        0
  1  m  R  R | H  H  H  W | 0  1  0  0 | F  0  0  1


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, W, and mm data fields.


Timing: 2 + mv oscillator clock cycles 
Memory: 1 program word 
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SWA P Swap Accumulator Words SWA P 


Operation:                                  Assembler Syntax:

D1 ↔ D0    (no parallel move)               SWAP D    (no parallel move)


Description: Exchange the MS and LS words of the destination accumulator. The extension register is not
affected by this instruction.


Example:
SWAP A

Before Execution            After Execution
FE   0000  1234             FE   1234  0000
A2   A1    A0               A2   A1    A0


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value
$FE:0000:1234. Execution of the SWAP A instruction exchanges the 16-bit value in the A1
register with the 16-bit value in the A0 register.
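
For readers who prefer to see the word exchange spelled out, here is a minimal Python sketch of the operation. The function name `swap_words` is hypothetical and the accumulator is represented as an unsigned 40-bit integer.

```python
def swap_words(acc: int) -> int:
    """Exchange the MS word (bits 31-16) and LS word (bits 15-0).

    The extension byte (bits 39-32) is left untouched.
    """
    ext = acc & 0xFF_0000_0000
    ms = (acc >> 16) & 0xFFFF
    ls = acc & 0xFFFF
    return ext | (ls << 16) | ms


# Value from the example: A = $FE:0000:1234.
print(f"A=${swap_words(0xFE_0000_1234):010X}")
```

Applying the function twice returns the original value, since the exchange is its own inverse.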


Condition Codes Affected: 
The condition codes are not affected by this instruction. 
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SWA P Swap Accumulator Words SWA P 


Instruction Format: 


SWAP D 
Opcode: 


000 %14;0 10 %41;/;0 141 41 1)F 0 0 1 


Instruction Fields: 


D F 
A 0 
B 1 
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
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SWI Software Interrupt SWI 


Operation: Assembler Syntax: 


Begin SWI exception processing SWI 


Description: Suspend normal instruction execution and begin SWI exception processing. The interrupt
priority level (I1,I0) is set to 3 in the status register (SR) if a long interrupt service routine is
used.

Restrictions: 


— A SWI instruction cannot be used in a fast interrupt routine. 


— A SWI instruction cannot be repeated using the REP instruction. 
Example: 


SWI                 ;begin SWI exception processing
Explanation of Example: The SWI instruction suspends normal instruction execution and initiates SWI 
exception processing. 


Condition Codes Affected: 
The condition codes are not affected by this instruction. 
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| 
SWI Software Interrupt SWI 


Instruction Format: 


SWI 
Opcode: 


15 12 11 8 7 4 3 0 


  0  0  0  0 | 0  0  0  0 | 0  0  0  0 | 0  1  0  1


Instruction Fields: none 


Timing: 8 oscillator clock cycles 
Memory: 1 program word 
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Tec Transfer Conditionally Tce 


Operation:                                  Assembler Syntax:
If cc, then S → D                           Tcc S,D
If cc, then S → D and R0 → Rn               Tcc S,D R0,Rn


Description: Transfer data from the specified source register S1 to the specified destination accumulator
D1 if the specified condition is true. If a second source register R0 and a second destination
register Rn are also specified, transfer data from address register R0 to address register
Rn if the specified condition is true. If the specified condition is false, a NOP is executed.
When used after the CMP or CMPM instructions, the Tcc instruction can perform many useful
functions such as a “maximum value”, “minimum value”, “maximum absolute value”, or
“minimum absolute value” function. The desired value is stored in the destination accumulator
D. If address register R0 is used as an address pointer into an array of data, the address
of the desired value is stored in the address register Rn. The Tcc instruction may be
used after any instruction and allows efficient searching and sorting algorithms. Transferring
A to A or B to B conditionally updates a register without affecting the ALU registers.


The term “cc” may specify the following conditions:


“cc” Mnemonic                                   Condition

CC (HS) - carry clear (higher or same)          C = 0
CS (LO) - carry set (lower)                     C = 1
EC      - extension clear                       E = 0
EQ      - equal                                 Z = 1
ES      - extension set                         E = 1
GE      - greater than or equal                 N ⊕ V = 0
GT      - greater than                          Z + (N ⊕ V) = 0
LC      - limit clear                           L = 0
LE      - less than or equal                    Z + (N ⊕ V) = 1
LS      - limit set                             L = 1
LT      - less than                             N ⊕ V = 1
MI      - minus                                 N = 1
NE      - not equal                             Z = 0
NR      - normalized                            Z + (Ū • Ē) = 1
PL      - plus                                  N = 0
NN      - not normalized                        Z + (Ū • Ē) = 0

where:  Ū denotes the logical complement of U,
        + denotes the logical OR operator,
        • denotes the logical AND operator,
        ⊕ denotes the logical Exclusive OR operator

Note: This instruction is considered to be a move-type instruction. Due to pipelining, if an address register
(R0-R3) is changed using a move-type instruction, the new contents of the destination address register will
not be available for use during the following instruction (i.e., there is a single instruction cycle pipeline delay).
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Tec Transfer Conditionally Tee 
Example:

CMP X0,A            ;compare X0 and A (sort for minimum)

TLT X0,A R0,R1      ;transfer X0 → A and R0 → R1 if X0<A


Explanation of Example: In this example, the contents of the 16-bit X0 register are transferred to the
40-bit A accumulator and the contents of the 16-bit R0 address register are transferred to the
16-bit R1 address register if the specified condition is true. If the specified condition is not
true, a NOP is executed.
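
The "transfer if true, otherwise NOP" behavior can be sketched in Python. This is a simplified model under explicit assumptions: the condition expressions are the usual flag definitions for these mnemonics (for example, LT is taken as N XOR V = 1) and only a subset is shown; registers are plain integers in a dictionary with no width or sign-extension modelling; and the names `CONDITIONS` and `tcc` are hypothetical.

```python
# Assumed flag definitions for a subset of the "cc" mnemonics.
CONDITIONS = {
    "CC": lambda f: f["C"] == 0,
    "CS": lambda f: f["C"] == 1,
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "MI": lambda f: f["N"] == 1,
    "PL": lambda f: f["N"] == 0,
    "GE": lambda f: (f["N"] ^ f["V"]) == 0,
    "LT": lambda f: (f["N"] ^ f["V"]) == 1,
    "GT": lambda f: (f["Z"] | (f["N"] ^ f["V"])) == 0,
    "LE": lambda f: (f["Z"] | (f["N"] ^ f["V"])) == 1,
}


def tcc(cc, flags, regs, src, dst, rn=None):
    """Return the register file after 'Tcc src,dst [R0,rn]'.

    If the condition is false the registers are returned unchanged (NOP).
    """
    if not CONDITIONS[cc](flags):
        return dict(regs)
    out = dict(regs)
    out[dst] = regs[src]
    if rn is not None:
        out[rn] = regs["R0"]
    return out


regs = {"X0": 0x0010, "A": 0x0020, "R0": 0x0100, "R1": 0x0000}
taken = tcc("LT", {"N": 1, "Z": 0, "V": 0, "C": 0}, regs, "X0", "A", "R1")
skipped = tcc("LT", {"N": 0, "Z": 0, "V": 0, "C": 0}, regs, "X0", "A", "R1")
print(taken, skipped)
```

Note that the flags are an input to the model: Tcc itself never changes them, so several conditional transfers can follow a single compare.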


Condition Codes Affected: 
The condition codes are not affected by this instruction. 


Instruction Format: 


Tcc S1,D1 R0,Rn
Opcode: 
15 12 11 8 7 4 3 0 


Instruction Fields: 


S,D    JJJ F    S,D    JJJ F        TT  Rn
X0,A   100 0    A,A*   001 0        00  R0
X0,B   100 1    A,B    000 1        01  R1
Y0,A   101 0    B,A    000 0        10  R2
Y0,B   101 1    B,B    001 1        11  R3


* Encoding used by the assembler when no Data ALU transfer is specified in the instruction 
cc = 4-bit condition code = CCCC 


Mnemonic   C C C C        Mnemonic   C C C C
CC(HS)     0 0 0 0        CS(LO)     1 0 0 0
GE         0 0 0 1        LT         1 0 0 1
NE         0 0 1 0        EQ         1 0 1 0
PL         0 0 1 1        MI         1 0 1 1
NN         0 1 0 0        NR         1 1 0 0
EC         0 1 0 1        ES         1 1 0 1
LC         0 1 1 0        LS         1 1 1 0
GT         0 1 1 1        LE         1 1 1 1
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
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TFR Transfer Data ALU Register TFR 


Operation:                                  Assembler Syntax:
S → D    (parallel move)                    TFR S,D    (one parallel operation)
S → D    (two parallel reads)               TFR S,D    (two memory reads)


Description: Transfer data from the specified source Data ALU register S to the specified destination
Data ALU accumulator D. TFR uses the internal Data ALU data paths and thus data does
not pass through the data shifter/limiters. This allows the full 40-bit contents of one of the
accumulators to be transferred into the other accumulator without data shifting and/or
limiting. Moreover, since TFR uses the internal Data ALU data paths, parallel moves are
possible. The TFR instruction only affects the L or S condition code bits which can be set by
data movement associated with the instruction’s parallel move operations.


Example:
TFR X1,A X:(R0)+,Y1 X:(R3)+N3,X0    ;move X1 to A and
                                    ;update Y1, X0, R0, R3

Before Execution            After Execution
B5   0123  0123             00   4000  0000
A2   A1    A0               A2   A1    A0
4000                        4000
X1                          X1


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $B5:0123:0123 
and the 16-bit X1 register contains the value $4000. Execution of the TFR X1,A instruction 
moves the 16-bit value in X1 into the 40-bit A accumulator. 
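
The before/after values in this example suggest that a 16-bit source is placed in A1, with A0 cleared and A2 filled with the sign extension. The Python sketch below models that reading; it is an inference from the example rather than a statement of the hardware data path, and the function name `tfr_word_to_acc` is hypothetical.

```python
def tfr_word_to_acc(word: int) -> int:
    """Model of TFR with a 16-bit source (X0, Y0, X1 or Y1).

    Assumed behavior, inferred from the example: the word lands in bits
    31-16, bits 15-0 are cleared, and bits 39-32 hold the sign extension.
    """
    acc = (word & 0xFFFF) << 16
    if word & 0x8000:
        acc |= 0xFF << 32
    return acc


# Value from the example: X1 = $4000; the old contents of A are overwritten.
print(f"A=${tfr_word_to_acc(0x4000):010X}")
```

The previous accumulator contents ($B5:0123:0123 in the example) play no part in the result, which is why the function takes only the source word.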


Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Set according to the standard definition for the S bit
L - Set if data limiting has occurred during parallel move
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TFR Transfer Data ALU Register TFR 


Instruction Format: TFR S,D (parallel move) 
Opcode: 

 15       12 11        8  7        4  3        0
  1  m  R  R | H  H  H  W | 0  0  0  1 | F  J  J  J


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, W, and mm data fields.


Instruction Format: TFR S,D (two parallel reads) 
Opcode: 
15 12 11 8 7 4 3 0 


Instruction Fields: Please see the “Dual X Memory Data Read” description in the parallel move sec- 
tion for details on the mm and KKK data fields. 


One Parallel Operation                   Two Parallel Reads

S,D   JJJ F    S,D   JJJ F               S,D   DD F    S,D   DD F
B,A   000 0    X0,B  100 1               X0,A  00 0    X1,A  10 0
A,B   000 1    Y0,A  101 0               X0,B  00 1    X1,B  10 1
X,A   010 0    Y0,B  101 1               Y0,A  01 0    Y1,A  11 0
X,B   010 1    X1,A  110 0               Y0,B  01 1    Y1,B  11 1
Y,A   011 0    X1,B  110 1
Y,B   011 1    Y1,A  111 0
X0,A  100 0    Y1,B  111 1

Timing: 2 + mv oscillator clock cycles 

Memory: 1 program word 
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TFR(2) 


Two Word Data ALU Register Transfer


Operation:                                  Assembler Syntax:

S → D    (no parallel move)                 TFR(2) S,D    (no parallel operation)


Description: Transfer data from the specified source accumulator S to the specified 32-bit destination
Data ALU register D. GDB and XDB are used for this transfer. The transferred data passes
through the shifter/limiter; therefore, the L condition code bit will be affected.


Example:
TFR(2) A,X          ;move A to X1:X0

Before Execution            After Execution
FF   FFFF  0123             FF   FFFF  0123
A2   A1    A0               A2   A1    A0
1234  4567                  FFFF  0123
X1    X0                    X1    X0


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $FF:FFFF:0123
and the 32-bit X1:X0 register contains the value $1234:5678. Execution of the TFR(2) A,X
instruction moves the 32-bit value in A into the 32-bit X (X1:X0) register. The L bit is not set.


Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

L - Set if data limiting has occurred
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TFR(2) Two Word Data ALU Register Transfer TFR(2) 


Instruction Format: 


TFR(2) S,D 
Opcode: 


  0  0  0  1 | 0  1  0  1 | 0  0  0  0 | F  0  0  J


Instruction Fields: 


S,D J F 

A,X 0 0 

B,X 0 1 

A,Y 1 0 

B,Y 1 1 
Timing: 2 oscillator clock cycles 
Memory: 1 program word 

TFR(3)                      Transfer Data ALU Register                      TFR(3)


Operation:                                  Assembler Syntax:
S1 → D1    X:<ea>,D2                        TFR(3) S1,D1    X:<ea>,D2
S1 → D1    S2,X:<ea>                        TFR(3) S1,D1    S2,X:<ea>

Description: Transfer data from the specified source accumulator S1 to the specified 16-bit destination
Data ALU register D1 with the specified memory parallel move. The TFR(3) instruction can
affect the L condition code bit in two ways. The parallel move transfer goes through the
shifter/limiter to the XDB. The register transfer uses the GDB and therefore only goes
through a limiter and is not affected by the scaling mode.

Example:
TFR(3) A,X1 X:(R0)+,X0      ;move A1 to X1 and X:(R0) to X0, update R0

6543
X:(R0)

Before Execution            After Execution
FF   FFFF  0123             FF   FFFF  0123
A2   A1    A0               A2   A1    A0
1234  5678                  FFFF  6543
X1    X0                    X1    X0

Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value
$FF:FFFF:0123. Execution of the TFR(3) A,X1 X:(R0)+,X0 instruction moves the 16-bit
value in A1 into the 16-bit X1 register and the 16-bit value located in X:(R0) into the 16-bit
register X0. R0 is then post-incremented by one.


Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4)
L - Set if data limiting has occurred during the data transfer or the parallel move
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TFR(3) Transfer Data ALU Register TFR(3) 


Instruction Format:
TFR(3) S1,D1 X:<ea>,D2
TFR(3) S1,D1 S2,X:<ea>


Opcode and Instruction Fields: Please see the “X Memory Data Move” description in the parallel move
section for details on the m, RR, HHH, and W data fields.


 15       12 11        8  7        4  3        0
  0  0  1  0 | 0  1  m  W | R  R  D  D | F  H  H  H

where “RR” refers to an Address Register R0-R3

HHH  D2,S2    HHH  D2,S2        S1,D1  DD F    S1,D1  DD F        ea        m
000  X0       100  A            A,X0   00 0    A,X1   10 0        (Rn)+     0
001  Y0       101  B            B,X0   00 1    B,X1   10 1        (Rn)+Nn   1
010  X1       110  A0           A,Y0   01 0    A,Y1   11 0
011  Y1       111  B0           B,Y0   01 1    B,Y1   11 1
Timing: 2 +mv oscillator clock cycles 
Memory: 1 program word 
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TST Test Accumulator TST 


Operation:                                  Assembler Syntax:

S-0    (parallel move)                      TST S    (parallel move)


Description: Compare the specified source accumulator S with zero and set the condition codes
accordingly. No result is stored although the condition codes are updated.


Example:
TST A X:(R0)+N0,B   ;set CCR bits for value in A, update B and R0

Before Execution            After Execution
01   0203  0000             01   0203  0000
A2   A1    A0               A2   A1    A0
0300                        0330
SR=MR:CCR                   SR=MR:CCR


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $01:0203:0000 
and the 16-bit condition code register (CCR) contains the value $0300. Execution of the 
TST A instruction compares the value in the A register with zero and updates the condition 
code register accordingly. The contents of the A accumulator are not affected. 
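
The change of the status register from $0300 to $0330 can be reproduced with a small Python sketch. It rests on assumptions that this section does not spell out: no scaling mode is active, E is taken to be set when bits 39-31 are not all equal, and U is taken to be set when bits 31 and 30 are equal; Section A.4 holds the authoritative definitions. The function name `tst_ccr` is hypothetical, and the S and L bits are not modelled.

```python
MASK40 = (1 << 40) - 1


def tst_ccr(acc: int) -> int:
    """Return the E, U, N, Z, V, C bits (CCR bits 5..0) for TST.

    Assumes no scaling: E looks at bits 39-31, U at bits 31 and 30.
    """
    acc &= MASK40
    top9 = acc >> 31                               # bits 39..31
    e = int(top9 not in (0x000, 0x1FF))            # extension in use
    u = int(((acc >> 31) & 1) == ((acc >> 30) & 1))  # unnormalized
    n = (acc >> 39) & 1
    z = int(acc == 0)
    v = 0                                          # always cleared by TST
    c = 0                                          # always cleared by TST
    return (e << 5) | (u << 4) | (n << 3) | (z << 2) | (v << 1) | c


# Value from the example: A = $01:0203:0000, SR = $0300 before execution.
sr = 0x0300 | tst_ccr(0x01_0203_0000)
print(f"SR=${sr:04X}")
```

For the example value, A2 is not a sign extension of bit 31 (E = 1) and bits 31 and 30 are equal (U = 1), so the CCR becomes $30.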


Condition Codes Affected:

|<--------- MR --------->|<--------- CCR -------->|
 15 14 13 12 11 10  9  8   7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0   S  L  E  U  N  Z  V  C

S - Computed according to the standard definition (see section A.4)
L - Set if data limiting has occurred during parallel move
E - Set if the signed integer portion of A or B result is in use
U - Set according to the standard definition of the U bit
N - Set if bit 39 of A or B result is set
Z - Set if A or B result equals zero
V - Always cleared
C - Always cleared

Note: The definition of the E and U bits varies according to the scaling mode being used. Please refer to
Section A.4 entitled “Condition Code Computation” for complete details.

Instruction Format:


TST S (parallel move)

Opcode:


1 m R R | H H H W | 0 0 1 0 | F 0 0 1


Instruction Fields: Please see the “X Memory Data Move” description in the parallel move section for
details on the m, RR, HHH, and W data fields.


S    F
A    0
B    1
Timing: 2+mv oscillator clock cycles 
Memory: 1 program word 

TST(2) Test Data ALU Register TST(2)


Operation: Assembler Syntax:


S - 0 (no parallel move) TST(2) S (no parallel move)


Description: Compare the specified source Data ALU register S with zero and set the condition codes
accordingly. No result is stored, although the condition codes are updated.


Example:
TST(2) X1 ;set CCR bits for value in X1

               Before Execution    After Execution
X1             $0203               $0203
SR (MR:CCR)    $0300               $0310


Explanation of Example: Prior to execution, the 16-bit X1 register contains the value $0203 and the
16-bit status register (SR = MR:CCR) contains the value $0300. Execution of the TST(2) X1
instruction compares the value in the X1 register with zero and updates the condition code
register accordingly, leaving $0310 in SR. The contents of the X1 register are not affected.


Condition Codes Affected: 


|<-------- MR -------->|<-------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C


U — Set if result is unnormalized 
N — Set if bit 31 of A or B result is set 
Z — Set if result equals zero 
C — Always cleared 

Instruction Format: 


TST S 
Opcode: 


000 %14;/;0 10 1/0 0 0 4};— 1 =D D 


“—” = don’t care 


Instruction Fields: 


S     DD
X0    00
Y0    01
X1    10
Y1    11
Timing: 2 oscillator clock cycles 
Memory: 1 program word 
WAIT Wait for Interrupt WAIT


Operation: Assembler Syntax: 


Disable clocks to the processor core and enter the WAIT processing state. WAIT 


Description: Enter the WAIT processing state. The internal clocks to the processor core and memories 
are gated off and all activity in the processor is suspended until an unmasked interrupt oc- 
curs. The clock oscillator and the internal I/O peripheral clocks remain active. When an un- 
masked interrupt or external (hardware) processor RESET occurs, the processor leaves 
the WAIT state and begins exception processing of the unmasked interrupt or RESET con- 
dition. The WAIT state is a low-power standby mode. 


Restrictions: 


— A WAIT instruction cannot be used in a fast interrupt routine. 
— A WAIT instruction cannot be the last instruction in a DO loop (at LA). 
— A WAIT instruction cannot be repeated using the REP instruction. 


Example: 


WAIT ;enter low power mode, wait for interrupt


Explanation of Example: The WAIT instruction suspends normal instruction execution and waits for an 
unmasked interrupt or external RESET to occur. 


Condition Codes Affected: 
The condition codes are not affected by this instruction. 

Instruction Format:


WAIT

Opcode:


0 0 0 0 | 0 0 0 0 | 0 0 0 0 | 1 0 1 1


Instruction Fields: None 


Timing: If an internal interrupt is pending during the execution of the WAIT instruction, the WAIT
instruction takes a minimum of 32T cycles to execute. If no internal interrupt is pending
when the WAIT instruction is executed, the period that the DSP is in the wait state is the
period before the interrupt or reset causing the DSP to exit the wait state plus a minimum of
28T cycles to a maximum of 31T cycles (see the Technical Data Sheet).


Memory: 1 program word 

ZERO Zero Extend Accumulator ZERO


Operation: Assembler Syntax:


0 → [bit 39-32] of D ZERO D (no parallel move)


Description: Zero extend the destination accumulator from bit 32 to bit 39.


Example:
ZERO A

               Before Execution    After Execution
A (A2:A1:A0)   $FF:6432:0000       $00:6432:0000


Explanation of Example: Prior to execution, the 40-bit A accumulator contains the value $FF:6432:0000.
Execution of the ZERO instruction clears the extension bits 32-39 and returns
$00:6432:0000 in A.
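
The effect of ZERO on the accumulator value is a single mask. The sketch below models only the data path (clearing bits 39-32) and not the condition code updates; the helper names `zero_extend` and `fmt_acc` are hypothetical.

```python
def zero_extend(acc40: int) -> int:
    """Clear extension bits 39..32 of a 40-bit accumulator value."""
    return acc40 & 0x00FFFFFFFF


def fmt_acc(acc40: int) -> str:
    """Format a 40-bit value in the manual's $A2:A1:A0 notation."""
    a2 = (acc40 >> 32) & 0xFF
    a1 = (acc40 >> 16) & 0xFFFF
    a0 = acc40 & 0xFFFF
    return f"${a2:02X}:{a1:04X}:{a0:04X}"


before = 0xFF64320000
print(fmt_acc(before), "->", fmt_acc(zero_extend(before)))
# $FF:6432:0000 -> $00:6432:0000
```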


Condition Codes Affected:


|<-------- MR -------->|<-------- CCR -------->|
 15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
 LF  *  *  * S1 S0 I1 I0 |  S  L  E  U  N  Z  V  C


E - Always cleared
U - Set according to the standard definition of the U bit
N - Always cleared
Z - Set if A or B result equals zero
V - Always cleared

Instruction Format:


ZERO D

Opcode:


15       12 11        8 7         4 3         0
 0  0  0  1 | 0  1  0  1 | 0  1  0  1 | F  0  0  0


Instruction Fields:


D    F
A    0
B    1
Timing: 2 oscillator clock cycles 
Memory: 1 program word 

A.6 INSTRUCTION TIMING 


This section describes how one can calculate the 16-bit DSP instruction timing manually 
using the tables provided in this section. Three complete examples are presented to illus- 
trate the “layered” nature of the tables. Alternatively, the user can obtain the number of 
instruction program words and the number of oscillator clock cycles required for a given 
instruction by using the 16-bit DSP simulator. This method of determining instruction tim- 
ing information is much faster and much simpler than using the aforementioned tables. 


The number of words per instruction is dependent on the addressing mode and the type 
of parallel data bus move operation specified. The symbols reference subsequent tables 
to complete the instruction word count. 


The number of oscillator clock cycles per instruction is dependent on many factors, includ- 
ing the number of words per instruction, the addressing mode, whether the instruction 
fetch pipe is full or not, the number of external bus accesses and the number of wait states 
inserted in each external access. The symbols reference subsequent tables to complete 
the execution clock cycle count. The following is a list of these tables and their purpose. 


• Table A-6 gives the number of instruction program words and the number of
oscillator clock cycles for each instruction mnemonic.

• Table A-7 gives the number of additional (if any) instruction words and
additional (if any) clock cycles for each type of parallel move operation.

• Table A-8 gives the number of additional (if any) clock cycles for each type of
MOVEC operation.

• Table A-9 gives the number of additional (if any) clock cycles for each type of
MOVEM operation.

• Table A-10 gives the number of additional (if any) clock cycles for each type of
MOVEP operation.

• Table A-11 gives the number of additional (if any) clock cycles for each type of
bit field manipulation (BFCHG, BFCLR, BFSET, BFTSTH, and BFTSTL)
operation.

• Table A-12 gives the number of additional (if any) clock cycles for each type of
branch/jump (Bcc, BRA, BSR, BScc, Jcc, JMP, JSR, and JScc) operation.

• Table A-13 gives the number of additional (if any) clock cycles for the RTI and
RTS instructions.

• Table A-14 gives the number of additional (if any) instruction words and
additional (if any) clock cycles for each effective addressing mode.

• Table A-15 gives the number of additional (if any) clock cycles for external data,
external program, and external I/O memory accesses.

All tables are based on the following assumptions. 


Assumptions: 
1. All instruction cycles are counted in oscillator clock cycles. 


2. The instruction fetch pipeline is full. 


3. There is no contention for instruction fetches. Thus, external program instruction 
fetches are assumed not to have to contend with external data memory accesses. 


4. There are no wait states for instruction fetches done sequentially (as for non- 
change-of-flow instructions), but they are taken into account for change-of-flow in- 
structions which flush the pipeline such as BRA/JMP, Bcc/Jcc, RTI, etc. 


In order to better understand and use the aforementioned tables, three examples are pre- 
sented prior to the actual tables. These examples attempt to illustrate the “layered” nature 
of the tables. 


Example 1: Arithmetic Instruction with 2 Parallel Reads 


Problem: Calculate the number of 16-bit instruction program words and the number 
of oscillator clock cycles required for the instruction 


MACR X1,Y0,A X:(R0)+,Y0 X:(R3)+,X1
where Operating Mode Register (OMR) = $02 (normal expanded memory map),
Bus Control Register (BCR) = $20,
R0 Address Register = $0052 (internal X memory), and
R3 Address Register = $0923 (external X memory).


Solution: To determine the number of instruction program words and the number of
oscillator clock cycles required for the given instruction, the user should
perform the following operations:


1. Look up the number of instruction program words and the number of oscillator 
clock cycles required for the opcode-operand portion of the instruction in Table A-6. 
According to Table A-6, the MACR instruction will require 1 instruction program 
word and will execute in (2 + mv) oscillator clock cycles. The term “mv” represents 
the additional (if any) instruction program words and the additional (if any) oscillator 
clock cycles that may be required over and above those needed for the basic 
MACR instruction due to the parallel move portion of the instruction. 


2. Evaluate the “mv” term using Table A-7. 
The parallel move portion of the MACR instruction consists of an XX Memory 
Read. According to Table A-7, the parallel move portion of the instruction will re- 
quire mv = axx additional oscillator clock cycles. The term “axx” represents the 
number of additional (if any) oscillator clock cycles that are required to access two 
operands in the X memory. 
3. Evaluate the “axx” term using Table A-15.
The parallel move portion of the MACR instruction consists of an XX Memory
Read. According to Table A-15, the term “axx” depends upon where the referenced
X memory locations are located in the 16-bit DSP memory space. External X memory
accesses require additional oscillator clock cycles according to the number of
wait states programmed into the 16-bit DSP Bus Control Register (BCR). Thus,
assuming that the 16-bit Bus Control Register contains the value $20, external X
memory accesses require wx = 1 wait state or additional oscillator clock cycle. For
this example, the first X memory reference is assumed to be an internal reference
while the second X memory reference is assumed to be an external reference.
Thus, according to Table A-15, the XX memory reference in the parallel move portion
of the MACR instruction will require axx = wx = 1 additional oscillator clock cycle.


4. Compute final results. 
Thus, based upon the assumptions given for Table A-6 and those listed in the prob- 
lem statement for Example 1, the instruction 


MACR X1,Y0,A X:(R0)+,Y0 X:(R3)+,X1


will require 1 instruction program word and will execute in
(2 + mv) = (2 + axx) = (2 + wx) = (2 + 1) = 3 oscillator clock cycles.
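
The layered substitution in Example 1 can be written out as a short sketch. It covers only the case worked above (an XX Memory Read with one internal and one external X operand); the function name `macr_xx_read_cycles` is hypothetical, and `wx` is taken as an input rather than decoded from the BCR.

```python
MACR_BASE_CYCLES = 2  # Table A-6: MACR executes in (2 + mv) cycles


def macr_xx_read_cycles(wx: int) -> int:
    """Cycles for MACR with an XX Memory Read, one internal and one
    external X operand (the Int:Ext case of Table A-15)."""
    axx = wx          # Table A-15: Int:Ext access costs wx extra cycles
    mv = axx          # Table A-7: XX Memory Read costs mv = axx
    return MACR_BASE_CYCLES + mv


print(macr_xx_read_cycles(wx=1))  # 3
```

With zero external wait states the same instruction would take the base 2 cycles, which shows that the extra cycle in the example comes entirely from the external operand.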


Note that if a similar calculation were to be made for a MOVEC, MOVEM, MOVEP, or one
of the bit field manipulation (BFCHG, BFCLR, BFSET, BFTSTH, or BFTSTL) instructions, the
use of Table A-7 would no longer be appropriate. For these cases, the user would refer
to Table A-8, Table A-9, Table A-10, or Table A-11, respectively.


Example 2: Jump Instruction 


Problem: Calculate the number of 16-bit instruction program words and the number 
of oscillator clock cycles required for the instruction 
JLC R2 
where Operating Mode Register (OMR) = $02 (normal expanded memory map), 
Bus Control Register (BCR) = $04, 
R2 Address Register = $2000 (external P memory).


Solution: To determine the number of instruction program words and the number of
oscillator clock cycles required for the given instruction, the user should
perform the following operations:

1. Look up the number of instruction program words and the number of oscillator clock 
cycles required for the opcode-operand portion of the instruction in Table A-6. 
According to Table A-6, the Jcc instruction will require (1 + ea) instruction program 
words and will execute in (4 + jx) oscillator clock cycles. The term “ea” represents 
the number of additional (if any) instruction program words that are required for the 
effective address of the Jcc instruction. The term “jx” represents the number of
additional (if any) oscillator clock cycles required for a jump-type instruction.


2. Evaluate the “jx” term using Table A-12. 

According to Table A-12, the Jcc instruction will require jx = ea + (2 * ap) additional 
oscillator clock cycles. The term “ea” represents the number of additional (if any) os- 
cillator clock cycles that are required for the effective addressing mode specified in 
the Jcc instruction. The term “ap” represents the number of additional (if any) oscil- 
lator clock cycles that are required to access a P memory operand. Note that the “+ 
(2 * ap)” term represents the two program memory instruction fetches executed at 
the end of a one-word jump instruction to refill the instruction pipeline. 


3. Evaluate the “ea” term using Table A-14. 
The JLC R2 instruction uses the “No update” effective addressing mode. According
to Table A-14, this operation will require ea = 0 additional instruction program words
and ea = 0 additional oscillator clock cycles.


4. Evaluate the “ap” term using Table A-15. 

According to Table A-15, the term “ap” depends upon where the referenced P mem- 
ory location is located in the 16-bit DSP memory space. External memory accesses 
require additional oscillator clock cycles according to the number of wait states pro- 
grammed into the 16-bit DSP Bus Control Register (BCR). Thus, assuming that the 
16-bit Bus Control Register contains the value $04, external P memory accesses 
require wp = 4 wait states or additional oscillator clock cycles. For this example, 
the P memory reference is assumed to be an external reference. Thus, according 
to Table A-15, the Jcc instruction will use the value ap = wp = 4 oscillator clock cy- 
cles. 


5. Compute final results. 
Thus, based upon the assumptions given for Table A-6 and those listed in the problem
statement for Example 2, the instruction


JLC R2


will require (1 + ea) = (1 + 0) = 1 instruction program word
and will execute in (4 + jx) = (4 + ea + (2 * ap)) = (4 + ea + (2 * wp)) = (4 + 0 + (2 * 4))
= 12 oscillator clock cycles.
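
The same chain for Example 2, as a sketch. The function names are hypothetical; `ea_words`, `ea_cycles` and `ap` are the values looked up in Tables A-14 and A-15 and are passed in rather than derived.

```python
def jcc_words(ea_words: int) -> int:
    """Table A-6: a Jcc instruction occupies (1 + ea) program words."""
    return 1 + ea_words


def jcc_cycles(ea_cycles: int, ap: int) -> int:
    """Table A-6 and A-12: Jcc executes in 4 + jx cycles, jx = ea + 2*ap.

    The 2*ap term covers the two program fetches that refill the pipeline.
    """
    jx = ea_cycles + 2 * ap
    return 4 + jx


# JLC R2 with "No update" addressing (ea = 0) and an external target, wp = 4
print(jcc_words(0), jcc_cycles(ea_cycles=0, ap=4))  # 1 12
```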


Example 3: RTI Instruction 


Problem: Calculate the number of 16-bit instruction program words and the number 
of oscillator clock cycles required for the instruction 


RTI


where Operating Mode Register (OMR) = $02 (normal expanded memory map), 
Bus Control Register (BCR) = $41, and 


Return Address (on the stack) = $0100 (internal P memory). 
Solution: To determine the number of instruction program words and the number of
oscillator clock cycles required for the given instruction, the user should
perform the following operations:


1. Look up the number of instruction program words and the number of oscillator clock 
cycles required for the opcode-operand portion of the instruction in Table A-6. 
According to Table A-6, the RTI instruction will require 1 instruction program word 
and will execute in (4 + rx) oscillator clock cycles. The term “rx” represents the num- 
ber of additional (if any) oscillator clock cycles required for an RTI or RTS instruc- 
tion. 


2. Evaluate the “rx” term using Table A-13. 
According to Table A-13, the RTI instruction will require rx = (2 * ap) additional os- 
cillator clock cycles. The term “ap” represents the number of additional (if any) os- 
cillator clock cycles that are required to access a P memory operand. Note that the 
term “(2 * ap)” represents the two program memory instruction fetches executed at 
the end of an RTI or RTS instruction to refill the instruction pipeline. 


3. Evaluate the “ap” term using Table A-15. 

According to Table A-15, the term “ap” depends upon where the referenced P mem- 
ory location is located in the 16-bit DSP memory space. External memory accesses 
require additional oscillator clock cycles according to the number of wait states pro- 
grammed into the 16-bit DSP Bus Control Register (BCR). Thus, assuming that the 
16-bit Bus Control Register contains the value $0041, external P memory accesses 
require wp = 1 wait state or additional oscillator clock cycle. For this example, the
P memory reference is assumed to be an internal reference. This means that the 
return address ($0100) pulled from the system stack by the RTI instruction is in in- 
ternal P memory. Thus, according to Table A-15, the RTI instruction will use the val- 
ue ap = 0 additional oscillator clock cycles. 


4. Compute final results. 
Thus, based upon the assumptions given for Table A-6 and those listed in the problem
statement for Example 3, the instruction


RTI


will require one instruction program word and will execute in
(4 + rx) = (4 + (2 * ap)) = (4 + (2 * 0)) = 4 oscillator clock cycles.
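
Example 3 reduces to one formula. The sketch below (hypothetical function name `rti_cycles`) also evaluates the case of an external return address, which the example does not work through, to show where the BCR setting would matter.

```python
def rti_cycles(ap: int) -> int:
    """Table A-6 and A-13: RTI (and RTS) execute in 4 + rx cycles,
    with rx = 2 * ap for the two pipeline-refill fetches."""
    rx = 2 * ap
    return 4 + rx


print(rti_cycles(ap=0))  # return address in internal P memory: 4
print(rti_cycles(ap=1))  # external P memory with wp = 1: 6
```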
Table A-6 Instruction Timing Summary
Mnemonic Instruction | Osc. Mnemonic Instruction 
Program Clock Program 
Words Cycles Words 


2+mv JSR 1+ea 
2 LEA 
2+mv LSL 
2+mv LSR 
2 MAC 
2+mv MACR 
2 MAC (uu,su) 
2+mv 
2 
2 
4+mvb 
4+mvb 
4+mvb 
4+mvb 
4+mvb 
4+jx MPY(su,uu) 
4+jx NEG 
2/8 NEGC 
4+jx NOP 
4+jx NORM 
2 NOT 
2+mv OR 
2+mv ORI 
2+mv REP 
2+mv REPcc 
4 RESET 
4 RND 
2+mv ROL 
2+mv ROR 
2 RTI 
2 RTS 
6/10+mv SBC 
6 STOP 
2 SUB 
2+mv SUBL 
SWAP 
SWI 
Tcc 
TFR 
TFR(2) 
TFR(3) 
TST 
TST(2) 
WAIT 
ZERO 




DO FOREVER
ENDDO 
EOR 




Note 1: The STOP instruction disables the internal clock oscillator. After clock turn-on, an internal counter 
counts some 65,536 cycles before enabling the clock to the internal DSP circuits. 
Note 2: The WAIT instruction takes a minimum of 16 cycles to execute when an internal interrupt is pend-
ing during the execution of the WAIT instruction. 


Note 3: BRKcc executes in 8 clock cycles if cc is true. Otherwise it executes in 2 clock cycles. 


Note 4: The DO instruction executes in 10 clock cycles if the DO argument is equal to zero. In that case, 
the loop is skipped. Otherwise it executes in 6 clock cycles. 


Note 5: The REP instruction executes in 6 clock cycles if the argument is equal to zero. In that case, the 
repetition is skipped. Otherwise it executes in 4 clock cycles. 


Note 6: REPcc executes in 6 clock cycles if cc is true on entry. Otherwise it executes in 4 clock cycles. 
When the condition becomes true, 4 additional clock cycles are necessary to exit the REP. 
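
Notes 3 through 6 describe two-valued cycle counts. A minimal sketch of those rules follows; the function names are hypothetical, and any additional terms that Table A-6 attaches to these instructions are not modelled.

```python
def brkcc_cycles(cc_true: bool) -> int:
    """Note 3: BRKcc takes 8 cycles if cc is true, otherwise 2."""
    return 8 if cc_true else 2


def do_cycles(loop_count: int) -> int:
    """Note 4: DO takes 10 cycles when the count is zero (loop skipped),
    otherwise 6."""
    return 10 if loop_count == 0 else 6


def rep_cycles(repeat_count: int) -> int:
    """Note 5: REP takes 6 cycles when the count is zero (repetition
    skipped), otherwise 4."""
    return 6 if repeat_count == 0 else 4


def repcc_entry_cycles(cc_true_on_entry: bool) -> int:
    """Note 6: REPcc takes 6 cycles if cc is true on entry, otherwise 4.
    The 4 extra cycles needed to exit once cc becomes true are not included."""
    return 6 if cc_true_on_entry else 4


print(brkcc_cycles(True), do_cycles(0), rep_cycles(5), repcc_entry_cycles(False))
# 8 10 4 4
```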


Table A-7 Parallel Data Move Timing 


Parallel Move operation Comments 


No Parallel Data Move 
Immediate Short Data 
Register to Register 
Address Reg. Update 
X Memory Move 
X Memory and Register 
XX Memory Read 


Table A-8 MOVEC Timing Summary


MOVEC Operation Comments


Immediate → Register
Register ↔ Register
X Memory ↔ Register


Table A-9 MOVEM Timing Summary


MOVEM Operation        +mvm Cycles    Comments

Register ↔ P Memory    4+ea+ap
X Memory ↔ P Memory    4+ea+ap


Note that the “ap” term present in Table A-9 represents the wait states spent when ac- 
cessing the program memory during DATA read or write operations and does not refer to 
instruction fetches. 
Table A-10 MOVEP Timing Summary


MOVEP Operation          +mvp Cycles       Comments

Register ↔ Peripheral    aio
X Memory ↔ Peripheral    ea + ax + aio


Table A-11 Bit Field Manipulation Timing Summary


Bit Manipulation Operation    +mvb Cycles      Comments

BFxxx Peripheral              2 * aio
BFxxx X Memory                ea + (2 * ax)
BFTSTx Peripheral             aio
BFTSTx X Memory               ea + ax


where BFxxx = BFCHG, BFCLR, or BFSET
and BFTSTx = BFTSTH or BFTSTL
Table A-12 Branch/Jump Instruction Timing Summary

Branch/Jump Instruction Operation    +jx Cycles        Comments

Bxxx                                 eab + (2 * ap)
Jxxx                                 ea + (2 * ap)


where Bxxx = Bcc, BRA, BScc, and BSR
      Jxxx = Jcc, JMP, JScc, and JSR


The one word branch instructions using the 6-bit signed address, as well as all one-word 
jump instructions, execute two program memory fetches to refill the pipeline which is rep- 
resented by the “+ (2 * ap)” term. 


For all other branch instructions, another instruction cycle (two clock cycles) is necessary
to compute the new PC address from the relative address.


All two-word jumps execute three program memory fetches to refill the pipeline but one 
of those fetches is sequential (the instruction word located at the jump instruction 2nd 
word address+1). If the jump instruction was fetched from program memory using wait 
states, another “ap” should be added to account for that third fetch. 
Table A-13 RTI/RTS Timing Summary


Operation    +rx Cycles    Comments

RTI          2 * ap
RTS          2 * ap


The term “2 * ap” comes from the two instruction fetches done by the RTI/RTS instruction 
to refill the pipeline. 


Table A-14 Addressing Mode Timing Summary 


Effective Addressing Mode 


Address Register Indirect 


No Update 
Postincrement by 1 
Postdecrement by 1 

Post addition by Offset Nn 
Indexed by Offset Nn 
Predecrement by 1 


Special 


Immediate Data 
Absolute Address 
Immediate Short Data 
Short Branch Address 
Absolute Short Address 
I/O Short Address 
Implicit 
Indexed by short displacement 
Acc. Indirect Address 


oO+CD0000-—= 
mMmMooo|onn 
Table A-15 Memory Access Timing Summary


Access Type    X Mem Access

Int
Ext

Int:Int
Int:Ext        wx
Ext:Ext        2 + 2 * wx
I/O:I/O        2
I/O:Int        2
I/O:Ext        2 + 2 * wx


where wx = external X memory access wait states
      wp = external P memory access wait states


where wx and wp are programmable from 0-31 wait states in the Port A Bus Control Register (BCR).
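
The three worked examples quote BCR values of $20 (wx = 1), $04 (wp = 4) and $41 (wp = 1). A sketch of a decoder consistent with those values follows. The field positions are an assumption inferred from the examples and from the 0-31 range (wp in bits 4-0, wx in bits 9-5); they are not stated in this section, so the chip-specific User's Manual should be checked before relying on them. The name `decode_bcr` is hypothetical.

```python
def decode_bcr(bcr: int) -> tuple[int, int]:
    """Return (wx, wp) from a BCR value.

    ASSUMPTION: wp occupies bits 4..0 and wx bits 9..5. This layout is
    inferred from the worked examples, not taken from a register map.
    """
    wp = bcr & 0x1F
    wx = (bcr >> 5) & 0x1F
    return wx, wp


for value in (0x20, 0x04, 0x41):
    wx, wp = decode_bcr(value)
    print(f"BCR=${value:04X}: wx={wx}, wp={wp}")
```

Under this assumed layout, $20 yields wx = 1, $04 yields wp = 4 and $41 yields wp = 1, matching the values used in Examples 1, 2 and 3.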
Table A-16 Dual Read Instructions


DSP56100 Family 
DATA ALU eae DOUBLE 
OPERATION ADDRESS DESTINATION 
Oper. Reg. Read1 Read2 Dest1 Dest2 
MOVE (Rn)+ (R3)+ F XO 
MAC/R X1,Y1,F (Rn)+Nn (R3)+ YO XO 
MPY/R X1,Y0,F 
XO.YLE (Rn)+ (R3)+N3 x1 X0 
X0,Y0,F (Rn)+Nn (R3)+N3 Y1 XO 
n=[0,2] XO x1 
ADD X1,F F=0 >A YO x1 
SUB X0,F = 
F=1— 3B 
TFR Y1F = F YO 
YO,F v1 x1 
ADD FF 
SUB FF 
TFR FF 
Table A-17 LMS Instruction 
DSP56100 Family 
DATA ALU DOUBLE 
OPERATION TRANSFER 
Oper. Reg. TRANSFER1 TRANSFER2 
MAC X0,X0,F F (Rn)+Nn x1 F 
MPY = 
X1,X0,F XO F 
n=[0,2] = 
A1,Y0,F F-0 >A Y1 F 
B1,X0,F Feil >8 YO F 
F= Opposite accumulator 
Y0,X1,F 
Y1,X1,F 
Y1,X0,F 
Y0,X0,F 
Table A-18 Data ALU Instructions with One Parallel Operation


DSP56100 Family 


DATA ALU PARALLEL MEMORY 
OPERATION READ or WRITE 
Oper. Reg. Effective Address Dest/Source 
MAC +X0,X0,F (Rn)+ x1 
MPY +X1,X0,F (Rn)+Nn XO 
+A1,Y0,F (F1) Y1 
+B1,X0,F (R2+xx) YO 
+Y0,X1,F AO 
+Y1,X1,F BO 
+Y1,X0,F A 
+Y0,X0,F B 
ONE ADDRESS UPDATE 
ADD X1,F Effective Address 
SUB X0,F 
TFR Y1,F ony 
OR/AND YO,F (Rn)+Nn 
EOR 
CMP/CMPM PARALLEL REGISTER 
TRANSFER 
Source Destination 
X0 F 
ADD X,F x1 F 
SUB Y,F = 
YO F 
MOVE Y1 F 
SBC X,F A XO 
Y,F 
A x1 
CMP/CMPM F,F B YO 
SUBL, TFR 
ADD, SUB 2 iS 
RND F F F 
TST 
ABS AO XO 
INC/INC24 AO x4 
DEC/DEC24 
CLR/CLR24 BO YO 
NEG 
ASL/ASR BO v1 
NOT No Transfer 
ROL/ROR 
LSL/LSR 
Table A-19 Bit Field Manipulation Instructions


DSP56100 Family


OPERATION: BFTSTH, BFTSTL, BFCHG, BFSET, BFCLR (each written as #iiii,<operand>)

OPERAND                  COMMENTS
X:(Rn)                   n=[0,3]
X:<aa>                   First 32 words of X memory, 5-bit address
X:<pp>                   Last 32 words of X memory, 5-bit address
X1,X0,Y1,Y0,
R0,R1,R2,R3,
N0,N1,N2,N3,
M0,M1,M2,M3,
A2,B2,A1,B1,
A0,B0,A,B,
SR,OMR,SP,SSH,
SSL,LA,LC


Table A-20 Effective Address Update 


DSP56100 Family 


OPERATION SOURCE DESTINATION 
ADDRESS REGISTER 
REGISTER 

LEA (Rn) RO,R1,R2,R3 
(Rn)+ NO,N1,N2,N3 
(Rn)- 
(Rn)+Nn 
n=[0,3] 


Table A-21 JUMP/BRANCH Instructions 


DSP56100 Family 
OPERATION OPERAND COMMENTS 
JSR (Rn) n=[0,3] 
JMP 
Jcc $xxxx 16-bit absolute
JScc address 
Table A-21 JUMP/BRANCH Instructions


DSP56100 Family 


OPERATION OPERAND COMMENTS 
BSR (Rn) n=[0,3] 
BRA ; 
Bcc $xxxx 16-bit absolute
BScc address 
JSR AA 8-bit absolute 
address [0,256] 
BRA aa 8-bit PC relative 
address 
[-128,+127] 
Bcc ee 6-bit PC relative 
address 
[-32,+31] 
Table A-22 REP and DO Instructions 
DSP56100 Family 
OPERATION OPERAND COMMENTS 
REP X:(Rn) n=[0,3] 
DO a ; 
#xx 8-bit immediate 
short data 
X1,X0,Y1,Y0, 
RO,R1,R2,R3, 
NO,N1,N2,N3 
MO,M1,M2,M3 
A2,B2,A1,B1, 
AO,B0,A,B 
SR,OMR,SP,SSH, 
SSL,LA,LC 
REPcc 16 conditions 
DO FOREVER 
Table A-23 Short Immediate Move Instructions


DSP56100 Family 


OPERATION DESTINATION COMMENTS 
MOVE(l) #xx, X1 immediate short 8 bit 
X0 signed data 
Y1 (data is put in the 
YO LSByte) 
Table A-24 MOVE Program and Control Instructions 
DSP56100 Family 
OPERATION Source/Dest. Dest./Source COMMENTS 
MOVE(M) P:(Rn) A, AO, B, BO 
P:(Rn)+ XO, X1, YO, Y1 
P:(Rn)- 
P:(Rn)+Nn 
P:(R2+xx) 
MOVE(M) X:(Rn)+ P:(Rn)+ 
X:(Rn)+Nn P:(Rn)+Nn 
MOVE(C) X:(Rn) All registers X:HXXXX: 
X:(Rn)+ Long 16-bit 
X:(Rn)- absolute address 
X:(Rn)+Nn 
X:(Rn+Nn) FXXXX: 
X:-(Rn) Long 16-bit 
X:#XXXX immediate 
#XXXX data 
X:(A1) 
X:(B1) 
X:(R2+Xx) 
MOVE(C) All registers All registers 
Table A-25 MOVE Absolute Short and


MOVE Peripheral Instructions 


DSP56100 Family 
OPERATION Source/Dest. Dest./Source COMMENTS 
MOVE(S) X:<aa> A, B, First 32 word of X 
X0, YO memory 
5 bit address 
MOVE(P) X:<pp> A, B, Last 32 word of X 
X0, YO memory 
5 bit address 
X:(Rn)+ 
X:(Rn)+Nn 


Table A-26 Transfer with Parallel MOVE Instruction 


DSP56100 Family 


OPERATION REGISTER TRANSFER PARALLEL MOVE 
Source Destination Source/Dest. | Dest./Source 

TFR(3) A XO, X1, X:(Rn)+ X0,X1,Y0,Y1, 
B YO, Y1 X:(Rn)+Nn AO, BO, A, B 


Table A-27 Register Transfer without Parallel MOVE Instruction 


DSP56100 Family 


OPERATION SOURCE DESTINATION 
TFR(2) A X 
B Y 


Table A-28 Register Transfer Conditional MOVE Instruction 


DSP56100 Family 
OPERATION Data ALU Address Register 
Tcc A, F RO,RO 
B, F 
YO, F RO,Rm 
X0, F 
Table A-29 Conditional Program Controller Instructions


DSP56100 Family 


OPERATION 
BRKcc 
DEBUGcc 
Table A-30 Logical Immediate Instructions 
DSP56100 Family 
OPERATION DESTINATION COMMENTS 
ORI #XX, CCR 8 bit immediate data 
ANDI #XX, MR 
OMR 


Table A-31 Double Precision Data ALU Instructions 


DSP56100 Family 


DATA ALU 
OPERATION 
Operation sign unsigned 
DMAC Y1, XO F 
X1, Y1, F 
x1, YO, F 
XO, YO, F 
MPY(su,uu) Y1, XO, F 
MAC(su,uu) Al ¥1, F 
x1, YO, F 
XO, YO, F 
Table A-32 Integer Data ALU Instructions


DSP56100 Family 


DATA ALU OPERATION 


Operation 


IMAC X0,X0,F 
IMPY X1,X0,F 
A1,Y0,F 
B1,X0,F 
YO,X1,F 
Y1,X1,F 
Y1,X0,F 
YO,X0,F 


Table A-33 Division Instruction 


DSP56100 Family 


DATA ALU OPERATION 


Operation 


DIV X1,F 
X0,F 
Y1,F 
YO,F 
Table A-34 Other Data ALU Instructions


DSP56100 Family 
OPERATION 
Norm Rn,F n=[0,3] 
TST2 X1,X0,Y1,Y0 Test data registers 
ADC X,F 
Y,F 
CHKAAU Set V,N,Z according to last address 
ALU operation 

ZERO F Zero F from bit 32 to 39 
EXT F Sign extend F from bit 31 to 39 
SWAP F Swap F1 and FO 
NEGC F Negate with borrow 
ASL4 F 
ASR4 F 
ASR16 F Move A, AO arithmetic 

Table A-35 Special Instructions 

DSP56100 Family 
OPERATION 
WAIT 
STOP 
ENDDO 
RESET 
RTS 
RTI 
SWI 
DEBUG 
NOP 

APPENDIX B 


DSP56100 BENCHMARKS 




SECTION CONTENTS

B.1 INTRODUCTION  B-3
B.2 FIRST SET OF BENCHMARKS  B-5
B.2.1 Real Multiply  B-5
B.2.2 N Real Multiplies  B-5
B.2.3 Real Update  B-5
B.2.4 N Real Updates  B-6
B.2.5 Real Correlation Or Convolution (FIR Filter)  B-6
B.2.6 Real * Complex Correlation Or Convolution (FIR Filter)  B-7
B.2.7 Complex Multiply  B-7
B.2.8 N Complex Multiplies  B-8
B.2.9 Complex Update  B-8
B.2.10 N Complex Updates  B-9
B.2.11 Complex Correlation Or Convolution (Complex FIR)  B-9
B.2.12 Nth Order Power Series (Real)  B-10
B.2.13 2nd Order Real Biquad IIR Filter  B-10
B.2.14 N Cascaded Real Biquad IIR Filters  B-11
B.2.15 N Radix 2 FFT Butterflies  B-13
B.2.16 LMS Adaptive Filter  B-14
B.2.17 FIR Lattice Filter  B-24
B.2.18 All Pole IIR Lattice Filter  B-25
B.2.19 General Lattice Filter  B-26
B.2.20 Normalized Lattice Filter  B-27
B.2.21 [1x3][3x3] Matrix Multiply  B-28
B.2.22 [NxN][NxN] Matrix Multiply  B-29
B.2.23 N Point 3x3 2-D FIR Convolution  B-30
B.2.24 Signed 16 Bit Result Divide  B-32
B.2.25 Signed Integer Divide  B-33
B.2.26 Multiply 32-bit Fractions  B-34
B.3 SECOND SET OF BENCHMARKS  B-35
B.3.1 Sine Wave Generation Using Double Integration Technique  B-35
B.3.2 Sine Wave Generation Using Second Order Oscillator  B-36
B.3.3 IIR Filter Using Cascaded Transpose BIQUAD Cell  B-37
B.3.4 Find the Index of a Maximum Value in an Array  B-39
B.3.5 Proportional Integrator Differentiator (PID) Algorithm  B-40
B.3.6 Reed Solomon Main Loop  B-41
B.3.7 N Double Precision Real Multiplies  B-42
B.3.8 Double Precision Autocorrelation  B-42




B.1 INTRODUCTION 


Appendix B consists of a set of DSP benchmarks intended to highlight the DSP56100 family performance
in various applications, show examples of programming techniques, and provide code fragments for user
application programs. Additional code will be put on the Dr. Bub Electronic Bulletin Board System as it
becomes available. The following table lists these benchmark programs and provides an overview of each
program’s performance.


The assembly language source is organized into 5 columns as shown below.


Label Opcode Operands Data Bus Data Bus Comment
FIR MAC X1,X0,A X:(R0)+,X1 X:(R3)+,X0 ;Do each tap


The Label column is used for program entry points and end of loop indication. The Opcode column indicates 
the Data ALU, Address ALU or Program Controller operation to be performed. The Operands column spec- 
ifies the operands to be used by the opcode. The Data Bus specifies an optional data transfer over the Data 
Bus and the addressing mode to be used. The Comment column is used for documentation purposes and 
does not affect the assembled code. The Opcode column must always be included in the source code. For 
each benchmark, the number of program words and instruction cycles are given. 


The following equates are used in the benchmark programs. 


        page    132
        opt     cc
;define section
AD          EQU     0
BD          EQU     $100
bd          EQU     $100
C           EQU     $200
c           EQU     $200
D           EQU     $300
N           EQU     100
AR          EQU     $300
AI          EQU     $400
OUTPUT      EQU     $500
output      EQU     $FFF1
INPUT       EQU     $501
input       EQU     $FFF1
W           EQU     0
w           EQU     0
H           EQU     0
XM          EQU     0
state       equ     0
ntaps       equ     $10
k           equ     0
n           equ     32
p           equ     10
mask        equ     10
image       equ     $40
dividend    equ     .25
divisor     equ     5
paddr       equ     0
qaddr       equ     4
w1          equ     0
w2          equ     10
s           equ     0
tablebase   equ     0
lpc         equ     8
frame       equ     0
cor         equ     $100
shift       equ     $80         ;shift constant
table       equ     $180        ;base address of a-law table
            org     p:$40


                                                   Program     Program
Benchmark                                          Length      Length      Page
                                                   in Icyc     in Words    Number
B.2.1   Real Multiply                              3           3           B-4
B.2.2   N Real Multiplies                          2N          11          B-4
B.2.3   Real Update                                4           4           B-4
B.2.4   N Real Updates                             3N          14          B-5
B.2.5   N Term Real Convolution (FIR)              1N          9           B-5
B.2.6   N Term Real*Complex Convolution            2N          15          B-6
B.2.7   Complex Multiply                           6           6           B-6
B.2.8   N Complex Multiplies                       4N          14          B-7
B.2.9   Complex Update                             7           7           B-7
B.2.10  N Complex Updates                          6N          18          B-8
B.2.11  N Term Complex Convolution (FIR)           4N          14          B-8
B.2.12  Nth Order Power Series                     1N          13          B-9
B.2.13  2nd Order Real Biquad Filter               12          12          B-9
B.2.14  N Cascaded 2nd Order Biquads               5N          23          B-10
B.2.15  N Radix 2 FFT Butterflies                  10N         13          B-12
B.2.16  Adaptive LMS FIR                           2N+19       22          B-13
B.2.17  FIR Lattice Filter                         4N+7        10          B-23
B.2.18  All Pole IIR Lattice Filter                3N+11       14          B-24
B.2.19  General Lattice Filter                     4N+12       15          B-25
B.2.20  Normalized Lattice Filter                  5N+11       15          B-26
B.2.21  [1x3][3x3] Matrix Multiply                 21          21          B-27
B.2.22  [NxN][NxN] Matrix Multiply                 N^3+7N^2    25          B-28
B.2.23  3x3 2-D FIR Kernel                         12          44          B-29
B.2.24  Signed 16 Bit Result Divide                36          18          B-31
B.2.25  Signed Integer Divide                      32          15          B-32
B.2.26  Multiply 32/48-bit Fractions               4+8         4+8         B-33
B.3.1   Wave Generation Double Integration         2N          15          B-34
B.3.2   Wave Generation 2nd Order Oscillator       4N          16          B-35
B.3.3   Cascaded Transpose BIQUAD Cell             8N          15          B-36
B.3.4   IIR nth Order Direct Form II Canonic       2N          11          B-37
B.3.5   Find Index Of A Max Value In Array         3N          10          B-38
B.3.6   PID Algorithm                              5           5           B-39
B.3.7   Reed Solomon Main Loop                     18N         17          B-40
B.3.8   N Double Precision Real Multiplies         9N          18          B-41
B.3.9   Double Precision Autocorrelation                       19          B-41

Table B-1 Benchmark Overview
B.2 FIRST SET OF BENCHMARKS

B.2.1 Real Multiply

;c = a * b
;                                                      Prog    Icyc
;                                                      words   Cycles
    MOVE                X:(R0)+N0,X1   X:(R3)+N3,X0    1       1
    MPYR  X1,X0,A                                      1       1
    MOVE  A,X:(R1)                                     1       1
;   Totals                                             3       3
B.2.2 N Real Multiplies

;c(I) = a(I) * b(I), I=1,...,N
    opt   cc
    MOVE  #AD,R0                                       2       2
    MOVE  #BD,R3                                       2       2
    MOVE  #C,R2                                        2       2
    MOVE                X:(R0)+,Y0     X:(R3)+,X0      1       1
    DO    #N,END_DO2                                   2       3
    MPYR  Y0,X0,A       X:(R0)+,Y0     X:(R3)+,X0      1       1
    MOVE  A,X:(R2)+                                    1       1
END_DO2
;   Totals                                             11      2N+10
B.2.3 Real Update

;d = c + a * b
    opt   cc
    MOVE                X:(R0)+N0,X1   X:(R3)+N3,X0    1       1
    MOVE  X:(R2),A                                     1       1
    MACR  X1,X0,A                                      1       1
    MOVE  A,X:(R1)                                     1       1
;   Totals                                             4       4


B.2.4 N Real Updates

;d(I) = c(I) + a(I) * b(I), I=1,...,N
    opt   cc
    MOVE  #AD,R0                                       2       2
    MOVE  #BD,R3                                       2       2
    MOVE  #C,R2                                        2       2
    MOVE  #D,R1                                        2       2
    MOVE                X:(R0)+,Y0     X:(R3)+,X0      1       1
    DO    #N,END_DO4                                   2       3
    MOVE  X:(R2)+,A                                    1       1
    MACR  Y0,X0,A       X:(R0)+,Y0     X:(R3)+,X0      1       1
    MOVE  A,X:(R1)+                                    1       1
END_DO4
;   Totals                                             14      3N+12

B.2.5 Real Correlation Or Convolution (FIR Filter)

;c(n) = SUM(I=0,...,N-1) {a(I) * b(n-I)}
    opt   cc
    MOVE  #AD,R0                                       2       2
    MOVE  #BD,R3                                       2       2
    CLR   A             X:(R0)+,Y0                     1       1
    MOVE  X:(R3)+,X0                                   1       1
    REP   #N                                           1       2
    MAC   Y0,X0,A       X:(R0)+,Y0     X:(R3)+,X0      1       1
    RND   A                                            1       1
;   Totals                                             9       1N+9
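
The sum computed by this benchmark can be cross-checked against a short floating-point model. The sketch below is Python, not DSP56100 code; the function name `fir_output` and the convention that samples outside the stored range are zero are assumptions made only for illustration.

```python
def fir_output(a, b, n):
    """Return c(n) = SUM(I=0..len(a)-1) a[I] * b[n - I].

    Samples of b outside the stored range are treated as zero (assumption).
    """
    acc = 0.0  # plays the role of accumulator A after CLR A
    for i, coeff in enumerate(a):
        k = n - i
        if 0 <= k < len(b):
            acc += coeff * b[k]  # one multiply-accumulate per tap
    return acc


coeffs = [0.5, 0.25, 0.125]
samples = [1.0, 0.0, -1.0, 0.5]
result = fir_output(coeffs, samples, 2)
```

The model does not reproduce the 16-bit fractional rounding performed by RND, so results on hardware may differ in the least significant bit.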
B.2.6 Real * Complex Correlation Or Convolution (FIR Filter)

;cr(n) + jci(n) = SUM(I=0,...,N-1) {(ar(I) + jai(I)) * b(n-I)}
;cr(n) = SUM(I=0,...,N-1) {ar(I) * b(n-I)}
;ci(n) = SUM(I=0,...,N-1) {ai(I) * b(n-I)}
    opt   cc
    MOVE  #AR,R0                                       2       2
    MOVE  #AI,R1                                       2       2
    MOVE  #BD,R3                                       2       2
    CLR   A             X:(R0)+,X1                     1       1
    CLR   B             X:(R1)+,Y1                     1       1
    MOVE  X:(R3)+,X0                                   1       1
    DO    #N,END_DO6                                   2       3
    MAC   X0,X1,A       X:(R0)+,X1                     1       1
    MAC   X0,Y1,B       X:(R1)+,Y1     X:(R3)+,X0      1       1
END_DO6
    RND   A                                            1       1
    RND   B                                            1       1
;   Totals                                             15      2N+14
B.2.7 Complex Multiply

;cr + jci = (ar + jai) * (br + jbi)
;cr = ar*br - ai*bi
;ci = ar*bi + ai*br
;Y1 = ar    X1 = br
;Y0 = ai    X0 = bi
;
;X memory:  r0 -> ar, ai     r3 -> br, bi     r2 -> cr, ci
    opt   cc
    MOVE                X:(R0)+,Y1     X:(R3)+,X1      1       1   ;ar, br
    MPY   Y1,X1,A       X:(R0)+,Y0     X:(R3)+,X0      1       1   ;ar*br, ai, bi
    MACR  -Y0,X0,A                                     1       1   ;ar*br-ai*bi
    MPY   Y1,X0,B       A,X:(R2)+                      1       1   ;ar*bi
    MACR  Y0,X1,B                                      1       1   ;ar*bi+ai*br
    MOVE  B,X:(R2)+                                    1       1
;   Totals                                             6       6
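
The two equations for cr and ci can be checked numerically with a few lines of Python. This is a floating-point sketch only; the helper name `complex_multiply` is hypothetical and does not correspond to any routine in the manual.

```python
def complex_multiply(ar, ai, br, bi):
    """Return (cr, ci) for (ar + j*ai) * (br + j*bi)."""
    cr = ar * br - ai * bi  # real part: product minus product
    ci = ar * bi + ai * br  # imaginary part: sum of the cross terms
    return cr, ci


cr, ci = complex_multiply(0.5, -0.25, 0.25, 0.5)
reference = complex(0.5, -0.25) * complex(0.25, 0.5)
```

Comparing against Python's built-in `complex` type confirms that both cross terms, ar*bi and ai*br, must appear in the imaginary part.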
B.2.8 N Complex Multiplies

;cr(I) + jci(I) = (ar(I) + jai(I)) * (br(I) + jbi(I)), I=1,...,N
;cr(I) = ar(I) * br(I) - ai(I) * bi(I)          Y1=ar  X1=br
;ci(I) = ar(I) * bi(I) + ai(I) * br(I)          Y0=ai  X0=bi
    opt   cc
    MOVE  #AD,R0                                       2       2
    MOVE  #C-1,R2                                      2       2
    MOVE  #BD,R3                                       2       2
    MOVE  X:(R2),B                                                 ;dummy move!
    MOVE                X:(R0)+,Y1     X:(R3)+,X1      1       1   ;ar, br
    DO    #N,END_DO8                                   2       3
    MPY   Y1,X1,A       X:(R0)+,Y0     X:(R3)+,X0      1       1   ;ar*br, ai, bi
    MACR  -Y0,X0,A      B,X:(R2)+                      1       1   ;ar*br-ai*bi
    MPY   Y0,X1,B       A,X:(R2)+                      1       1   ;ai*br
    MACR  Y1,X0,B       X:(R0)+,Y1     X:(R3)+,X1      1       1   ;ar*bi+ai*br, ar
END_DO8
    MOVE  B,X:(R2)+                                    1       1
;   Totals                                             14      4N+11
B.2.9 Complex Update

;dr + jdi = cr + jci + (ar + jai) * (br + jbi)
;dr = cr + ar*br - ai*bi
;di = ci + ar*bi + ai*br
;Y1 = ar    X1 = br
;Y0 = ai    X0 = bi
;
;X memory:  r0 -> ar, ai     r3 -> br, bi     r2 -> cr, ci     r1 -> dr, di
    opt   cc
    MOVE  X:(R2)+,A                                    1       1   ;cr
    MOVE                X:(R0)+,Y1     X:(R3)+,X1      1       1
    MAC   Y1,X1,A       X:(R0)+,Y0     X:(R3)+,X0      1       1   ;cr+ar*br, ai, bi
    MACR  -Y0,X0,A      X:(R2)+,B                      1       1   ;cr+ar*br-ai*bi
    MAC   Y1,X0,B       A,X:(R1)+                      1       1   ;ci+ar*bi
    MACR  Y0,X1,B                                      1       1   ;ci+ar*bi+ai*br
    MOVE  B,X:(R1)+                                    1       1
;   Totals                                             7       7
B.2.10 N Complex Updates

    opt   cc
    MOVE  #AD,R0                                       2       2
    MOVE  #BD,R3                                       2       2
    MOVE  #D-1,R1                                      2       2
    MOVE  #C,R2                                        2       2
    MOVE  X:(R1),B                                                 ;dummy in B
    MOVE  X:(R0)+,Y1                                   1       1   ;ar
    DO    #N,END_DOA                                   2       3
    MOVE                X:(R2)+,A      X:(R3)+,X0      1       1   ;cr, br
    MAC   Y1,X0,A       X:(R0)+,Y0     X:(R3)+,X1      1       1   ;cr+ar*br, ai, bi
    MACR  -Y0,X1,A      B,X:(R1)+                      1       1   ;cr+ar*br-ai*bi
    MOVE  X:(R2)+,B                                    1       1   ;ci
    MAC   Y1,X1,B       A,X:(R1)+                      1       1   ;ci+ar*bi, dr
    MACR  Y0,X0,B       X:(R0)+,Y1                     1       1   ;ci+ar*bi+ai*br
END_DOA
    MOVE  B,X:(R1)+                                    1       1
;   Totals                                             18      6N+13
B.2.11 Complex Correlation Or Convolution (Complex FIR)

;cr(n) + jci(n) = SUM(I=0,...,N-1) {(ar(I) + jai(I)) * (br(n-I) + jbi(n-I))}
;cr(n) = SUM(I=0,...,N-1) {ar(I) * br(n-I) - ai(I) * bi(n-I)}     Y1=ar  X1=br
;ci(n) = SUM(I=0,...,N-1) {ar(I) * bi(n-I) + ai(I) * br(n-I)}     Y0=ai  X0=bi
    opt   cc
    MOVE  #AD,R0                                       2       2
    MOVE  #BD,R3                                       2       2
    CLR   A             X:(R0)+,Y1                     1       1   ;ar
    CLR   B             X:(R3)+,X1                     1       1   ;br
    DO    #N,END_DOB                                   2       3
    MAC   Y1,X1,A       X:(R0)+,Y0     X:(R3)+,X0      1       1   ;ar*br, ai, bi
    MAC   Y1,X0,B                                      1       1   ;ar*bi
    MAC   Y0,X1,B       X:(R0)+,Y1     X:(R3)+,X1      1       1   ;ar*bi+ai*br, ar
    MAC   -Y0,X0,A                                     1       1   ;ar*br-ai*bi
END_DOB
    RND   A                                            1       1
    RND   B                                            1       1
;   Totals                                             14      4N+11
B.2.12 Nth Order Power Series (Real)

;c = SUM(I=0,...,N) {a(I) * b**I} = [[[a(n)*b + a(n-1)]*b + a(n-2)]*b + a(n-3)]...
    opt   cc
    MOVE  #BD,R1                                       2       2
    MOVE  #AD,R0                                       2       2
    MOVE  X:(R1),Y0                                    1       1   ;b
    MOVE  Y0,X0                                        1       1
    MOVE  X:(R0)+,A                                    1       1   ;a(n)
    MOVE  X:(R0)+,B                                    1       1   ;a(n-1)
    DO    #N/2,END_DOC                                 2       3
    MAC   A1,Y0,B       X:(R0)+,A                      1       1   ;a(n-2)
    MAC   B1,X0,A       X:(R0)+,B                      1       1   ;a(0)+a(1)*b
END_DOC
    RND   A                                            1       1
;   Totals                                             13      1N+12
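
The nested form on the right-hand side of the equation is Horner's rule. A minimal Python sketch of that evaluation order follows; the name `power_series` and the list layout (a[I] multiplies b**I) are assumptions for illustration, and no fixed-point effects are modelled.

```python
def power_series(a, b):
    """Evaluate SUM(I=0..N) a[I] * b**I using Horner's rule."""
    acc = 0.0
    for coeff in reversed(a):  # start from the highest-order coefficient
        acc = acc * b + coeff  # one multiply-accumulate per coefficient
    return acc


value = power_series([1.0, 2.0, 3.0], 0.5)  # 1 + 2*0.5 + 3*0.25
```

Horner's rule needs one multiply-accumulate per coefficient, which is consistent with the 1N term in the cycle count above.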
B.2.13 2nd Order Real Biquad IIR Filter

;w(n)/2 = x(n)/2 - (a1/2) * w(n-1) - (a2/2) * w(n-2)
;y(n)/2 = w(n)/2 + (b1/2) * w(n-1) + (b2/2) * w(n-2)
;D High Memory Order - w(n-2), w(n-1)
;D Low Memory Order - (a2/2), (a1/2), (b2/2), (b1/2)
;this version uses two pointers
    opt   cc
    MOVE  #-1,N0                                       2       2
    ORI   #$08,MR                                      1       1
    RND   A             X:(R3)+,X1                     1       1   ;x1=a2/2
    MOVE  X:(R0)+,X0                                   1       1   ;x0=wn-2
    MAC   Y1,X0,A       X:(R0)+N0,Y1   X:(R3)+,X1      1       1   ;y1=wn-1
    MAC   Y1,X1,A       X1,X:(R0)+                     1       1   ;a=wn
    MOVE  X:(R3)+,X1                                   1       1   ;x1=b2/2
    MAC   X1,X0,A       A,X:(R0)+                      1       1
    MOVE  X:(R3)+,X1                                   1       1   ;x1=b1/2
    MACR  Y1,X1,A                                      1       1
    MOVE  A,X:<<output                                 1       1
;   Totals                                             12      12
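
The two difference equations above can be exercised with a small floating-point model. The Python sketch below works with unscaled coefficients (the division by 2 in the listing is a fixed-point scaling detail that is not modelled); the function name `biquad_step` and the state tuple layout are assumptions for illustration.

```python
def biquad_step(x, state, a1, a2, b1, b2):
    """Advance a 2nd order section by one sample.

    state is (w(n-1), w(n-2)); returns (y(n), new_state).
    """
    w1, w2 = state
    w = x - a1 * w1 - a2 * w2  # recursive (pole) part
    y = w + b1 * w1 + b2 * w2  # feed-forward (zero) part
    return y, (w, w1)


state = (0.0, 0.0)
outputs = []
for sample in [1.0, 0.0, 0.0]:  # unit impulse
    y, state = biquad_step(sample, state, a1=-0.5, a2=0.25, b1=0.5, b2=0.0)
    outputs.append(y)
```

Only two state values, w(n-1) and w(n-2), are kept per section, which matches the "D High Memory Order" comment in the listing.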
B.2.14 N Cascaded Real Biquad IIR Filters

;w(n)/2 = x(n)/2 - (a1/2) * w(n-1) - (a2/2) * w(n-2)
;y(n)/2 = w(n)/2 + (b1/2) * w(n-1) + (b2/2) * w(n-2)
;D High Memory Order - w(n-2)1,w(n-1)1,w(n-2)2,w(n-1)2,...
;D Low Memory Order - (a2/2)1,(a1/2)1,(b2/2)1,(b1/2)1,(a2/2)2,...
;this version uses two pointers
    opt   cc
    ORI   #$08,MR                                      1       1
    MOVE  #W,R0                                        2       2
    MOVE  #C,R3                                        2       2
    MOVE  #-1,N0                                       2       2
    MOVEP X:<<input,A                                  1       2
    RND   A             X:(R3)+,X1                     1       1   ;x1=a2/2
    MOVE  X:(R0)+,Y0                                   1       1   ;y0=wn-2
    DO    #N,END_DOE                                   2       3
    MAC   Y0,X1,A       X:(R0)+N0,Y1   X:(R3)+,X1      1       1   ;y1=wn-1
    MACR  Y1,X1,A       Y1,X:(R0)+                     1       1
    MOVE  X:(R3)+,X1                                   1       1   ;x1=b2/2
    MAC   Y0,X1,A       A,X:(R0)+                      1       1
    MOVE  X:(R3)+,X1                                   1       1   ;x1=b1/2
    MAC   Y1,X1,A       X:(R0)+,Y0     X:(R3)+,X1      1       1
END_DOE
;   Totals                                             18      6N+14
;this version uses three pointers
;
;X memory:  r0 -> w(n-2), w(n-1)     r3 -> a2/2, a1/2     r1 -> b2/2, b1/2
    opt   cc
    ORI   #$08,MR                                      2       2
    MOVE  #W,R0                                        2       2
    MOVE  #C,R3                                        2       2
    MOVE  #C+2,R1                                      2       2
    MOVE  #2,N3                                        2       2
    MOVE  #4,N1                                        2       2
    MOVE  #-1,N0                                       2       2
    MOVEP X:<<input,A                                  1       2   ;a=x
    MOVE                X:(R0)+,Y0     X:(R3)+,X0      1       1   ;y0=w-2
    DO    #N,END_DOF                                   2       3
    MAC   Y0,X0,A       X:(R0)+N0,Y1   X:(R3)+N3,X0    1       1   ;w-1; a1/2
    MACR  Y1,X0,A       Y1,X:(R0)+                     1       1   ;a=w
    MOVE                X:(R1)+N1,X0   X:(R3)+,X1      1       1   ;x0=b2/2
    MAC   Y0,X0,A       A,X:(R0)+                      1       1   ;a=w+b2/2w-2
    MAC   Y1,X1,A       X:(R0)+,Y0     X:(R3)+,X0      1       1   ;a=y; next w-2
END_DOF
;   Totals                                             23      5N+20
B.2.15 N Radix 2 FFT Butterflies

;Decimation in time (DIT), in-place algorithm
;
;[Figure: radix 2 butterfly, X = A + BW, Y = A - BW]
;X memory:  r0,r2 -> ar/xr, ai/xi     r3,r1 -> br/yr, bi/yi     r1 -> cos(2*pi*k/N), -sin(2*pi*k/N)
;Registers: X0 = bi   X1 = br   Y0 = wr   Y1 = -wi
;
;Twiddle Factor Wk = wr - jwi = cos(2*pi*k/N) - j sin(2*pi*k/N) pointed to by R1,
;which must be saved on each pass.
;
;xr = ar + wr * br - wi * bi
;xi = ai + wi * br + wr * bi
;yr = ar - wr * br + wi * bi = 2 * ar - xr
;yi = ai - wi * br - wr * bi = 2 * ai - xi
    opt   cc
    move                x:(r1)+,y0     x:(r3)+,x1                  ;y0=wr; x1=br
    move  x:(r0),b                                                 ;b=ar
    move  x:(r1)+n1,y1                                             ;y1=wi
; save r1, update r1 to point last bi/yi
    do    #n,end_bfly                                  2       3
    mac   y0,x1,b       x:(r3)+,x0                     1       1   ;b=ar+wrbr
    macr  -y1,x0,b      a,x:(r1)+                      1       1   ;b=xr
    move  x:(r0)+,a                                    1       1   ;a=ar
    subl  b,a           b,x:(r2)+                      1       1   ;a=2ar-xr=yr
    move  x:(r0),b                                     1       1   ;b=ai
    move  a,x:(r1)+                                    1       1
    mac   y1,x1,b       x:(r3)+,x1                     1       1   ;b=ai+wibr
    macr  y0,x0,b       x:(r0)+,a      x:(r3)+,x0      1       1   ;b=xi; a=ai
    subl  b,a           b,x:(r2)+                      1       1   ;a=2ai-xi=yi
    move  x:(r0),b                                     1       1   ;b=ar
end_bfly
    move  b,x:(r1)+n1                                  1       1   ;save last yi
; save r1, update r1 to point twiddle factors
;   Totals                                             13      10N+4
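
The four butterfly equations, including the shortcut yr = 2*ar - xr and yi = 2*ai - xi that the SUBL instructions exploit, can be checked with a short floating-point sketch. The Python below is illustrative only; the name `dit_butterfly` is hypothetical, and the pair (wr, wi) is combined exactly as written in the xr and xi equations of this section.

```python
def dit_butterfly(ar, ai, br, bi, wr, wi):
    """Return ((xr, xi), (yr, yi)) for one radix 2 DIT butterfly."""
    xr = ar + wr * br - wi * bi
    xi = ai + wi * br + wr * bi
    yr = 2 * ar - xr  # same value as ar - wr*br + wi*bi
    yi = 2 * ai - xi  # same value as ai - wi*br - wr*bi
    return (xr, xi), (yr, yi)


x_out, y_out = dit_butterfly(0.25, 0.5, 0.5, -0.25, 0.0, -1.0)
```

Forming y from 2*a - x avoids recomputing the two products, so each output pair costs only the two multiply-accumulates needed for x.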
B.2.16 LMS Adaptive Filter

;[Figure: LMS adaptive filter block diagram: tapped delay line x(n), x(n-1), ..., x(n-k), ..., x(n-N+1)
; with coefficients c(0), ..., c(N-1), filter output y(n) and desired signal d(n)]

;Notation and symbols:
;  x(n) - Input sample at time n.
;  d(n) - Desired signal at time n.
;  y(n) - FIR filter output at time n.
;  H(n) - Filter coefficient vector at time n. H={c0,c1,c2,...,ck,...,c(N-1)}
;  X(n) - Filter state variable vector at time n. X={x(n),x(n-1),...,x(n-N+1)}
;  Mu   - Adaptation gain.
;  N    - Number of coefficient taps in the filter.

;  True LMS Algorithm               Delayed LMS Algorithm
;  Get input sample                 Get input sample
;  Save input sample                Save input sample
;  Do FIR                           Do FIR
;  Get d(n), find e(n)              Update coefficients
;  Update coefficients              Get d(n), find e(n)
;  Output y(n)                      Output y(n)
;  Shift vector X                   Shift vector X

;System equations:
;  e(n)=d(n)-H(n)X(n)               e(n)=d(n)-H(n)X(n)               (FIR filter and error)
;  H(n+1)=H(n)+uX(n)e(n)            H(n+1)=H(n)+uX(n-1)e(n-1)        (Coefficient update)

;References:
;"Adaptive Digital Filters and Signal Analysis", Maurice G. Bellanger, Marcel Dekker,
;Inc. New York and Basel
;"The DLMS Algorithm Suitable for the Pipelined Realization of Adaptive Filters",
;Proc. IEEE ASSP Workshop, Academia Sinica, Beijing, 1986

;Note:
;The sections of code shown describe how to initialize all registers, filter an input
;sample and do the coefficient update. Only the instructions relating to the filtering
;and coefficient update are shown as part of the benchmark. Instructions executed
;only once (for initialization) or instructions that may be user application dependent
;are not included in the benchmark.
; Implementation of the true LMS on the DSP56100 family

;Memory map (X memory):
;  r0    -> x(n), x(n-1), ..., x(n-N+1)
;  r3,r2 -> c0, c1, ..., c(N-1)

    opt   cc
    move  #XM,r0                                                   ;start of X
    move  #N-1,m0                                                  ;mod 4
    move  #-2,n0                                                   ;adjustment for filtering
    move  m0,m2                                                    ;mod N
    movep x:<<input,y0                                             ;get input sample
    move  #H,r3                                        2       2   ;coefficients
    clr   a             y0,x:(r0)+                     1       1   ;save x(n)
    move  x:(r3)+,x1                                   1       1   ;get c0
    rep   #N-1                                         1       2   ;do fir
    mac   y0,x1,a       x:(r0)+,y0     x:(r3)+,x1      1       1
    macr  y0,x1,a                                      1       1   ;last tap

;(Get d(n), subtract fir output, multiply by "u", put the result in x0.
;This section is application dependent.)

    movep a,x:<<output                                             ;output fir if desired

    move  #H,r3                                        1       1   ;coefficients
    move  r3,r2                                        1       1   ;coefficients
    move  x:(r0)+,y0                                   1       1   ;get x(n)
    move  x:(r3)+,a                                    1       1   ;a=c0
    do    #ntaps,_coefupdate                           2       3   ;update coef.
    macr  x0,y0,a       x:(r0)+,y0     x:(r3)+,x1      1       1
    tfr   x1,a          a,x:(r2)+                      1       1   ;copy c
_coefupdate
    move  x:(r0)+n0,y0                                 1       1   ;update r0
    move  x:(r3)-,y0                                   1       1   ;update r3
;   Totals:                                            18      3N+17
; Implementation of the delayed LMS on the DSP56100 family

;Memory map (X memory):
;  r0 -> x(n), x(n-1), ..., x(n-N+1)
;  r1 -> c0
;  r3 -> c1, ..., c(N-1)

; Delayed LMS algorithm with matched coefficient and data vectors
; Algorithm runs in 2N (2 coeffs processed in each 4 cycle loop)

; Register Usage:
;   Data Sample is stored in Y0 and Y1.
;   Coefficient is stored in X1.
;   Loop Gain * Error is stored in X0.
;   FIR operation done in B.
;   Coeff update operation done in A.

; FIR sum   = a = a + c(k)old * x(n-k)
; c(k)new   = b = c(k)old - mu * e(old) * x(n-k-1)
    opt   cc
    move  #state,r0                                    2       2
    move  #ntaps,m0                                    2       2
    move  #-2,n0                                       2       2
    move  #1,n1                                        2       2
    move  #c+1,r3                                      2       2
    move  #c,r1                                        2       2
    clr   b             x:(r0)+,y0                     1       1   ;y0=x(n)
    move                x:(r0)+,y1     x:(r3)+,x1      1       1   ;y1=x(n-1)
    do    #ntaps/2,end_lms                             2       3
    mac   y0,x1,b       a,x:(r1)+n1    x1,a            1       1
    macr  x0,y1,a       x:(r0)+,y0     x:(r3)+,x1      1       1
    mac   x1,y1,b       a,x:(r1)+n1    x1,a            1       1
    macr  y0,x0,a       x:(r0)+,y1     x:(r3)+,x1      1       1
end_lms
    move  a,x:(r1)+                                    1       1
    move  (r0)+n0                                      1       1
;   Totals:                                            22      2N+19
; Implementation of the double precision true LMS on the DSP56100 family

;Memory map (X memory):
;  r0    -> x(n), x(n-1), ..., x(n-N+1)
;  r2,r3 -> c0h, c0l, c1h, c1l, ...

    opt   cc
    move  #XM,r0                                                   ;start of X
    move  #N-1,m0                                                  ;mod 4
    move  #-2,n0                                                   ;adjustment for filtering
    move  #2,n3
    move  m0,m2                                                    ;mod N
    movep x:<<input,y0                                             ;get input sample

    move  #H,r3                                                    ;coefficients
    clr   a             y0,x:(r0)+                                 ;save x(n)
    move  x:(r3)+n3,x1                                             ;get c0
    rep   #N-1                                                     ;do fir
    mac   x1,y0,a       x:(r0)+,y0     x:(r3)+n3,x1                ;mac; next x
    macr  x1,y0,a                                                  ;last tap
    movep a,x:<<output                                             ;output fir if desired

;(Get d(n), subtract fir output, multiply by "u", put the result in x0. This section is
;application dependent.)

    move  #H,r3                                        1       1   ;coefficients
    move  r3,r2                                        1       1   ;coefficients
    move  x:(r0)+,y0                                   1       1   ;get x(n)
    move  x:(r3)+,a                                    1       1   ;a1=c0h
    move  x:(r3)+,a0                                   1       1   ;a0=c0l
    do    #ntaps,_coefupdat                            2       3   ;update coef.
    mac   x0,y0,a       x:(r0)+,y0                     1       1
    move  x:(r3)+,b                                    1       1   ;u e(n) x(n)+c
    move  x:(r3)+,b0                                   1       1   ;b0=next c()l
    move  a1,x:(r2)+                                   1       1   ;save next c()h
    tfr   b,a           a0,x:(r2)+                     1       1   ;copy c
_coefupdat
    move  x:(r0)+n0,y0                                 1       1   ;update r0
    move  (r3)-                                        1       1   ;update r3
    move  (r3)-                                        1       1
;   Totals:                                            21      6N+17
; Implementation of the double precision delayed LMS on the DSP56100 family

;Memory map (X memory):
;  r0 -> x(n), x(n-1), ..., x(n-N+1)
;  r3 -> c0h, c0l, c1h, c1l, ...

; Delayed LMS algorithm with matched coefficient and data vectors
; Algorithm runs in 4N (2 coeffs processed in each 8 cycle loop)

; Register Usage:
;   Data Sample is stored in Y0 and Y1.
;   Coefficient is stored in X1.
;   Loop Gain * Error is stored in X0.
;   FIR operation done in B.
;   Coeff update operation done in A.

; FIR sum   = a = a + c(k)old * x(n-k)
; c(k)new   = b = c(k)old - mu * e(old) * x(n-k-1)
    opt   cc
    move  #state,r0                                    2       2
    move  #ntaps,m0                                    2       2
    move  #-2,n0                                       2       2
    move  #1,n1                                        2       2
    move  #c,r3                                        2       2
    move  #c-2,r1                                      2       2
    clr   b             x:(r0)+,y0                     1       1   ;y0=x(n)
    move                x:(r0)+,y1     x:(r3)+,x1      1       1   ;y1=x(n-1) x1=c0h
    do    #ntaps/2,end_lms2                            2       3
    mac   y0,x1,b       a,x:(r1)+n1                    1       1
    tfr   x1,a          a0,x:(r1)+n1                   1       1   ;a1=ckh
    move  x:(r3)+,a0                                   1       1   ;a0=ckl
    macr  x0,y1,a       x:(r0)+,y0     x:(r3)+,x1      1       1   ;x1=c(k+1)h
    mac   x1,y1,b       a,x:(r1)+n1                    1       1
    tfr   x1,a          a0,x:(r1)+n1                   1       1
    move  x:(r3)+,a0                                   1       1
    macr  y0,x0,a       x:(r0)+,y1     x:(r3)+,x1      1       1
end_lms2
    move  a,x:(r1)+                                    1       1
    move  a0,x:(r1)+                                   1       1
    move  (r0)+n0                                      1       1
;   Totals:                                            27      4N+20
The complete code for a true LMS that executes in two instruction cycles per tap is shown below.

; A brief description of how the algorithm is derived precedes the LMS code. Note that the coefficients
; stored to memory are saturated (should overflow occur), whereas the coefficients used in the FIR
; filter are not saturated. Therefore, the coefficients stored to memory, and the coefficients used in
; the FIR filter calculation, are not guaranteed to be the same. This should not be a problem in designs
; where the echo gain is guaranteed to be less than one.


opt cc,cex 

page 132,66 

section FAST_LMS 
n_tap equ 16 

org x:$0100 
ref_buf dsm n_tap ;Ref_buf is a modulo n_tap buffer, containing 

;a reference signal. 

coeff ds n_tap ;Note: Coefficients are stored in reverse order 
ref_ptr dc ref_buf ;data pointer for reference buffer 
scaled_error dc 0 ;scaled error sample from last call of echo_input 
norm_factor dc 0.1 ;scale factor for error signal 

org p:0 

jmp Test_EC 

org p:$0100 


The following pseudo code is for the "standard" LMS echo canceller algorithm.
    y(n)     = estimate of echo at time sample n.
    x(n)     = reference input signal at time n.
    input(n) = input signal (containing echo signal) at time n.
    c(n,k)   = k'th coefficient at time n.

/* initialize N coefficients at time 0 to 0 */
for (k = 0 to N-1) {
    c(0,k) = 0;
}

/* LMS follows, do forever */
n = 0;
do forever {
    y(n) = 0;
    for (k = 0 to N-1) {
        y(n) = y(n) + c(n,k)*x(n-k);                     /* FIR filter */
    }
    error(n) = input(n) - y(n);
    for (k = 0 to N-1) {
        c(n+1,k) = c(n,k) + delta*error(n)*x(n-k);       /* Coefficient Update */
    }
    n = n+1;
}
The following is equivalent to the above (i.e., given the same input signals, the error signal and
coefficients will follow the exact same trajectories). Note that the calculations are run from the back of
the filter to the front. This saves two registers. Also note that the calculation order of the coefficient
update and the FIR filter has been reversed.

/* initialize N coefficients at time -1 to 0 */
for (k = 0 to N-1) {
    c(-1,k) = 0;
}
error(-1) = 0;    /* The initial error must be set to zero. */

/* LMS follows, do forever */
n = 0;
do forever {
    y(n) = 0;
    for (k = N-1 to 0) {
        c(n,k) = c(n-1,k) + delta*error(n-1)*x(n-1-k);   /* Coefficient */
    }
    for (k = N-1 to 0) {
        y(n) = y(n) + c(n,k)*x(n-k);                     /* FIR filter */
    }
    error(n) = input(n) - y(n);
    n = n+1;
}

Because both inner loops run over the same range of k, they can be merged into a single loop:

/* initialize N coefficients at time -1 to 0 */
for (k = 0 to N-1) {
    c(-1,k) = 0;
}
error(-1) = 0;

/* LMS follows, do forever */
n = 0;
do forever {
    y(n) = 0;
    for (k = N-1 to 0) {
        c(n,k) = c(n-1,k) + delta*error(n-1)*x(n-1-k);   /* Coefficient */
        y(n) = y(n) + c(n,k)*x(n-k);                     /* FIR filter */
    }
    error(n) = input(n) - y(n);
    n = n+1;
}
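
The claim that the reordered loop follows the same error trajectory as the standard LMS can be checked with a floating-point model. The Python sketch below is not the DSP56100 routine; the function names, the test signals, and the assumption that samples before time 0 are zero are all illustrative choices.

```python
def standard_lms(x, d, num_taps, delta):
    """Standard LMS: FIR filter, then error, then coefficient update."""
    c = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        taps = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(ck * xk for ck, xk in zip(c, taps))
        e = d[n] - y
        c = [ck + delta * e * xk for ck, xk in zip(c, taps)]
        errors.append(e)
    return errors


def reordered_lms(x, d, num_taps, delta):
    """Merged loop: update c(k) with the previous error, then use it in the FIR."""
    c = [0.0] * num_taps
    prev_error = 0.0  # error(-1) must start at zero
    errors = []
    for n in range(len(x)):
        y = 0.0
        for k in range(num_taps - 1, -1, -1):  # back of the filter to the front
            x_old = x[n - 1 - k] if n - 1 - k >= 0 else 0.0
            x_new = x[n - k] if n - k >= 0 else 0.0
            c[k] += delta * prev_error * x_old
            y += c[k] * x_new
        prev_error = d[n] - y
        errors.append(prev_error)
    return errors


x_sig = [0.8, 0.0, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0]
d_sig = [0.0, 0.4, 0.2, 0.0, 0.0, 0.4, 0.2, 0.0]
errors_std = standard_lms(x_sig, d_sig, num_taps=4, delta=0.5)
errors_reordered = reordered_lms(x_sig, d_sig, num_taps=4, delta=0.5)
```

The two error lists agree to within floating-point rounding; the only numerical difference comes from the order in which the FIR products are summed.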
; Echo Canceller Routine (Fast LMS)
;
; Upon Entry
;   x1 should contain newest reference sample
;   y1 should contain newest input sample
;
; Upon Exit
;   b will contain echo cancelled output
;
; Note that the coefficients are stored in reverse time order.

FAST_LMS:
    move  #+1,n1
    move  #n_tap-1,m0
    move  #-1,m1
    move  m1,m3
    move  x:ref_ptr,r0                  ;r0 is the get reference signal pointer
    move  #coeff,r3                     ;r3 is the get coefficient pointer
    move  r3,r1                         ;r1 is the put coefficient pointer
    move  x:(r0),y0                     ;y0 contains the oldest reference sample
    move  x1,x:(r0)+                    ;store newest reference sample in reference register
    clr   b         x:(r3)+,a           ;fetch first coefficient, and clear b for FIR
    move  x:scaled_error,x0             ;x0 is the scaled error sample
    do    #n_tap,end_fir_update
    macr  x0,y0,a   x:(r0)+,y0   x:(r3)+,x1
    mac   a1,y0,b   a,x:(r1)+n1  x1,a
end_fir_update
    neg   b
    move  r0,x:ref_ptr                  ;store get reference pointer
    add   y1,b                          ;b = EC output = input - echo_estimate
    move  x:norm_factor,x0
    move  b,y0
    mpyr  y0,x0,a
    move  a,x:scaled_error
    move  b,x:output_port
    rts


; Test shell follows
; Remote signal is an impulse train, period greater than echo span
; Input is the resulting echo signal


org x:$1000 
output_port ds 1 ;write output to D/A 
org x:$0400 
remote_signal dc 0.8
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
org x:$0420
echo_input dc 0.0 
dc 0.0 
dc 0.2 
dc 0.4 
dc 0.7 
dc 0.4 
dc 0.2 
dc 0.1 
dc 0.0 
dc -0.1 
dc -0.2 
dc -0.1 
dc 0.0 
dc 0.1 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
dc 0.0 
remote_get dc remote_signal 
input_get dc echo_input 
    org   p
Test_EC
    move  #0,x0
    move  #-1,m0
    move  #coeff,r0
    rep   #n_tap
    move  x0,x:(r0)+                    ;zero coefficients
    move  #$ffff,x0
    do    x0,end_test_loop
    move  x:remote_get,r0
    move  #19,m0
    move  x:input_get,r1
    move  #19,m1
    move  x:(r0)+,x1
    move  x:(r1)+,y1
    move  r0,x:remote_get
    move  r1,x:input_get
    jsr   FAST_LMS
end_test_loop
    nop
    nop
    debug
    endsec
    end
B.2.17 FIR Lattice Filter

;Lattice filter benchmarks. N refers to the number of "k" coefficients in the lattice filter.
;Some filters may have coefficients other than the "k" coefficients, but their
;number may be determined from k.

;[Figure: FIR lattice filter with states S1, S2, S3, ..., Sx and a single section detail]
;
;The equations for a single section are:
;  t' = s*k + t ;  t' -> t
;  s' = t*k + s
;
;X memory:  r0 -> S1, S2, S3, Sx     r1 -> K1, K2, K3

    move  #state,r0                                                ;point to state variable storage
    move  #N,m0                                                    ;N=number of k coefficients
    move  #k,r1                                                    ;point to k coefficients
    move  #N-1,m1                                                  ;mod for k's
    move  #0,n0
    opt   cc
    movep x:<<input,b                                              ;get input
    move  b,x:(r0)+                                    1       1   ;save 1st state
    move  x:(r1)+,x0                                   1       1   ;get k
    do    #N,end_elat                                  2       3
    move  x:(r0)+n0,a   b,y0                           1       1   ;get s; copy t
    macr  x0,y0,a       x:(r0)+n0,x1                   1       1   ;t*k+s, copy s
    macr  x1,x0,b       x:(r1)+,x0                     1       1   ;s*k+t, nxt k
    move  a,x:(r0)+                                    1       1   ;sv st
end_elat
    move  x:(r0)-,y1                                   1       1
    move  x:(r1)-,x0                                   1       1
    movep b,x:<<output                                             ;output
;   Totals                                             10      4N+7
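
One conventional reading of the single-section equations is sketched below in Python as a floating-point model. It is an illustration under stated assumptions, not a transcription of the listing: the name `fir_lattice_step` is hypothetical, and the state list is assumed to hold the delayed lower-path value entering each section rather than following the benchmark's modulo-addressed memory layout.

```python
def fir_lattice_step(x, k, state):
    """Run one input sample through an FIR lattice.

    k[m] is the reflection coefficient of section m; state[m] is the delayed
    value s entering section m. Returns (output, new_state).
    """
    t = x
    new_state = [x]  # the input becomes the first delayed value
    for m, km in enumerate(k):
        s = state[m]
        t_next = s * km + t  # t' = s*k + t
        s_next = t * km + s  # s' = t*k + s (uses t before it is replaced)
        new_state.append(s_next)
        t = t_next
    return t, new_state[:len(k)]


k_coeffs = [0.5, 0.25]
state = [0.0, 0.0]
impulse_response = []
for sample in [1.0, 0.0, 0.0, 0.0]:
    out, state = fir_lattice_step(sample, k_coeffs, state)
    impulse_response.append(out)
```

For k = {0.5, 0.25} the impulse response is 1, 0.625, 0.25, 0, which is the direct-form polynomial obtained from the usual lattice step-up recursion.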
B.2.18 All Pole IIR Lattice Filter

;[Figure: all pole IIR lattice filter, input x(n), coefficients K1, K2, K3 and -K1, -K2, -K3,
; states S1, S2, S3, and a single section detail]
;
;The equations for a single section are:
;  t' = t - s*k ;  t' -> t
;  s' = t*k + s
;
;X memory:  r3 -> state variables     r0 -> K3 (coefficients stored as K1, K2, K3)

    opt   cc
    move  #k+N-1,r0                                                ;point to k
    move  #N-1,m0                                                  ;number of k's-1
    move  #-1,n1
    move  n1,n3
    movep x:<<input,a                                              ;get input sample
    move  #state,r3                                    2       2   ;pt to x()
    move  x:(r0)-,y1                                   1       1   ;y1=k3
    move  x:(r3)+,x1                                   1       1   ;x1=s3
    macr  -x1,y1,a      x:(r0)+n0,y1                   1       1   ;a=in-k3s3; y1=k2
    move  x:(r3)-,x1                                   1       1   ;x1=s2
    do    #n-1,endlat                                  2       3
    macr  -x1,y1,a      b,x:(r3)+                      1       1   ;a=a-s2k2=t2; update s3
    move  x:(r3)+,b     a,x1                           1       1   ;b=s2
    macr  x1,y1,b       x:(r0)+n0,y1   x:(r3)+n3,x1    1       1   ;b=s2+t2k2; get s1,k1
endlat
    move  b,x:(r3)+                                    1       1   ;sv 2nd last s
    move  x:(r0)+,y1                                   1       1   ;update r0
    move  a,x:(r3)+                                    1       1   ;save last s
    movep a,x:<<output                                             ;output
;   Total:                                             14      3N+12
B.2.19 General Lattice Filter

;[Figure: general lattice filter, input x(n), lattice coefficients K, tap weights W3, W2, W1, W0,
; states S3, S2, S1, and a single section detail]
;
;The equations for a single section are:
;  t' = t - s*k ;  t' -> t
;  s' = t*k + s
;  output = SUM(s' * w)
;
;X memory:  r0 -> k coefficients followed by W3, W2, W1, W0     r3 -> S3, S2, S1

    opt   cc
    move  #k,r0                                                    ;point to coefficients
    move  #2*N,m0                                                  ;mod 2*(# of k's)+1
    move  #-1,n3
    movep x:<<input,a                                              ;get input sample
    move  #state,r3                                    2       2   ;point to filter states
    move  x:(r0)+,y1                                   1       1   ;get first k
    move  x:(r3)-,x1                                   1       1   ;first s
    do    #N,el                                        2       3   ;do filter
    macr  -y1,x1,a      b,x:(r3)+                      1       1   ;t-k*s, save s
    move  x:(r3)+,b     a,x1                           1       1   ;get s again
    macr  x1,y1,b       x:(r0)+,y1     x:(r3)+n3,x1    1       1   ;t*k+s, get k & s
el
    move  b,x:(r3)+                                    1       1   ;sv 2nd to lst st
    clr   a             a,x:(r3)+                      1       1   ;sv first state
    move  x:(r3)+,x1                                   1       1   ;get last state
    rep   #N                                           1       2   ;do fir taps
    mac   y1,x1,a       x:(r0)+,y1     x:(r3)+,x1      1       1
    macr  y1,x1,a       x:(r3)+,x1                     1       1   ;finish, adj pointer
    movep a,x:<<output                                             ;output sample
;   Totals:                                            15      4N+13


B.2.20 Normalized Lattice Filter

;[Figure: normalized lattice filter with coefficients q2, q1, q0 and K2, K1, K0 (and -K2, -K1, -K0),
; states s2, s1, s0, tap weights w, and a single section detail]
;
;The equations for a single section are:
;  t' = t*q - s*k ;  t' -> t
;  u' = t*k + s*q
;  output = SUM(u' * w)
;
;X memory:  r0 -> q2, k2, q1, k1, q0, k0, w3, ..., w0     r3 -> Sx, S2, S1, S0

    opt   cc
    move  #c,r0                                                    ;point to coefficients
    move  #3*N,m0                                                  ;mod on coefficients
    move  #0,n3
    movep x:<<input,a                                              ;get input sample
    move  #state,r3                                    2       2   ;pt to state
    move  x:(r0)+,y1    a,x1                           1       1   ;get first Q
    do    #n,endnlat                                   2       3
    mpy   x1,y1,a       x:(r0)+,y0     x:(r3)+n3,x0    1       1   ;q*t; get k & s
    macr  -x0,y0,a      b,x:(r3)+                      1       1   ;q*t-k*s, save s
    mpy   y0,x1,b       a,x1                           1       1   ;k*t, set t'
    macr  y1,x0,b       x:(r0)+,y1                     1       1   ;k*t+q*s, get q
endnlat
    move  b,x:(r3)+                                    1       1   ;sv scnd lst st
    move  a,x:(r3)+                                    1       1   ;save state
    clr   a             x:(r3)+,x1                     1       1   ;clr acc
    rep   #n                                           1       2   ;do fir taps
    mac   x1,y1,a       x:(r0)+,y1     x:(r3)+,x1      1       1
    macr  x1,y1,a       x:(r3)+,x1                     1       1   ;rnd, adj pointer
    movep a,x:<<output                                             ;output sample
;   Totals:                                            15      5N+12
B.2.21 [1x3][3x3] Matrix Multiply

;    | a11 a12 a13 |   | b1 |
;    | a21 a22 a23 | x | b2 |
;    | a31 a32 a33 |   | b3 |
;
;X memory: matrix a stored as a11, a12, a13, ...

    opt   cc
    move  #AD,r3                                       2       2   ;point to mat a
    move  #bd,r0                                       2       2   ;point to vec b
    move  #2,m0                                        2       2   ;addr b mod 3
    move  #c,r2                                        2       2   ;point to vec c
    move                x:(r0)+,y0     x:(r3)+,x0      1       1   ;y0=a11; x0=b1
    mpy   y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;a11*b1
    mac   y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;+a12*b2
    macr  y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;+a13*b3
    move  a,x:(r2)+                                    1       1   ;store c1
    mpy   y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;a21*b1
    mac   y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;+a22*b2
    macr  y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;+a23*b3
    move  a,x:(r2)+                                    1       1   ;store c2
    mpy   y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;a31*b1
    mac   y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;+a32*b2
    macr  y0,x0,a                                      1       1   ;+a33*b3 -> c3
    move  a,x:(r2)+                                    1       1   ;store c3
;   Totals:                                            21      21
B.2.22 [NxN][NxN] Matrix Multiply

;The matrix multiplications are for square NxN matrices.
;
;    | a11 .. a1k .. a1N |   | b11 .. b1k .. b1N |   | c11 .. c1k .. c1N |
;    | ak1 .. akk .. akN | x | bk1 .. bkk .. bkN | = | ck1 .. ckk .. ckN |
;    | aN1 .. aNk .. aNN |   | bN1 .. bNk .. bNN |   | cN1 .. cNk .. cNN |
;
;All the elements are stored in "row major" format, i.e. for the array A:
;X memory: a11, ..., a1k, ..., a1N, ..., ak1, ..., aN1, ..., aNN

    opt   cc
    move  #AD,r0                                       2       2   ;point to A
    move  #bd,r3                                       2       2   ;point to B
    move  #c,r2                                        2       2   ;output mat C
    move  #N,b                                         2       2   ;array size
    move  b,n3                                         1       1
    do    #N,erows                                     2       3   ;do rows
    do    #N,ecols                                     2       3   ;do columns
    move  x1,r0                                        1       1   ;copy row A
    move  r1,r3                                        1       1   ;copy col B
    clr   a             x:(r0)+,y0                     1       1
    move  x:(r3)+n3,x0                                 1       1   ;clr sum & pipe
    rep   #N-1                                         1       2   ;sum
    mac   y0,x0,a       x:(r0)+,y0     x:(r3)+n3,x0    1       1
    macr  y0,x0,a       x:(r3)+,y1                     1       1   ;finish, next col
    move  a,x:(r2)+                                    1       1   ;save output
ecols
    add   x1,b                                         1       1   ;next row A
    move  b,x1                                         1       1
    move  #bd,r1                                       2       2   ;first element B
erows
;   Total:    Words:    Cycles:
;             25        ((8+(N-1))N+7)N+12
;                       = N^3 + 7N^2 + 7N + 12
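
The row major addressing used here (a row of A is read with stride 1 while a column of B is read with stride N, the role played by n3) can be illustrated with a short Python model. This is a floating-point sketch; the name `matmul_row_major` and the flat-list representation are assumptions for illustration.

```python
def matmul_row_major(a, b, n):
    """Multiply two n x n matrices stored as flat row major lists."""
    c = [0.0] * (n * n)
    for row in range(n):
        for col in range(n):
            acc = 0.0
            for k in range(n):
                # A advances by 1 along its row; B advances by n down its column.
                acc += a[row * n + k] * b[k * n + col]
            c[row * n + col] = acc
    return c


product = matmul_row_major([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 2)
```

The innermost loop corresponds to the N multiply-accumulates per output element, which is where the N^3 term in the cycle count comes from.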
B.2.23 N Point 3x3 2-D FIR Convolution

;The two dimensional FIR uses a 3x3 coefficient mask:
;
;    c11 c12 c13
;    c21 c22 c23
;    c31 c32 c33
;
;The image is an array of 512x512 pixels. To provide boundary conditions for the FIR filtering, the
;image is surrounded by a set of zeros such that the image is actually stored as a 514x514 array.
;
;[Figure: 512x512 image area surrounded by a border of zeros, 514x514 in total]
;
;The image (with boundary) is stored in row major storage. The first element of the
;array image is image(1,1) followed by image(1,2). The last element of the first row is image(1,514)
;followed by the beginning of the next row, image(2,1). These are stored sequentially in the array
;"im" in memory.
;
;Image(1,1) maps to index 0, image(1,514) maps to index 513,
;Image(2,1) maps to index 514 (row major storage).
;
;Although many other implementations are possible, this is a realistic type of image environment
;where the actual size of the image may not be an exact power of 2.
;Other possibilities include storing a 512x512 image but computing only a 511x511
;result, computing a 512x512 result without boundary conditions but throwing away the pixels on
;the border, etc.
;
;    r0 -> image(n,m)         image(n,m+1)         image(n,m+2)
;    r1 -> image(n+514,m)     image(n+514,m+1)     image(n+514,m+2)
;    r2 -> image(n+2*514,m)   image(n+2*514,m+1)   image(n+2*514,m+2)
;
;    r3 -> FIR coefficients
;    b  -> output image
    opt   cc
    move  #mask,r3                                     2       2   ;pt to coef.
    move  #-8,n3                                       2       2
    move  #image,r0                                    2       2   ;top boundary
    move  #image+514,r1                                2       2   ;left of first pixel
    move  #image+2*514,r2                              2       2   ;left of 2nd row
    move  #512,y1                                      2       2
    move  #-1,n1                                       2       2   ;adjust.
    move  n1,n2                                        1       1
    move  #output,b                                    2       2   ;output image
    move  x:(r0)+,y0                                   1       1   ;y0=im(1,1)
    move  x:(r3)+,x0                                   1       1   ;x0=c11
    do    y1,rows                                      2       3
    do    y1,cols                                      2       3
    mpy   y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;im(1,1)*c11
    mac   y0,x0,a       x:(r0)+n0,y0   x:(r3)+,x0      1       1   ;+im(1,2)*c12
    mac   y0,x0,a       x:(r1)+,y0     x:(r3)+,x0      1       1   ;+im(1,3)*c13
    mac   y0,x0,a       x:(r1)+,y0     x:(r3)+,x0      1       1   ;+im(2,1)*c21
    mac   y0,x0,a       x:(r1)+n1,y0   x:(r3)+,x0      1       1   ;+im(2,2)*c22
    mac   y0,x0,a       x:(r2)+,y0     x:(r3)+,x0      1       1   ;+im(2,3)*c23
    mac   y0,x0,a       x:(r2)+,y0     x:(r3)+,x0      1       1   ;+im(3,1)*c31
    mac   y0,x0,a       x:(r2)+n2,y0   x:(r3)+n3,x0    1       1   ;+im(3,2)*c32
    macr  y0,x0,a       x:(r0)+,y0     x:(r3)+,x0      1       1   ;+im(3,3)*c33
    move  a,x:(b1)                                     1       2
    inc24 b                                            1       1
cols
; adjust pointers for frame boundary
    move  #2,n1                                        2       2
    move  n1,n2                                        1       1
    inc   b             x:(r0)+,x1                     1       1   ;adj r0
    inc   b             x:(r1)+n1,x1                   1       1   ;adj r1
    move  (r2)+n2                                      1       1   ;adj r2
    move  x:(r0)+,x1                                   1       1   ;preload
    move  #-1,n1                                       2       2   ;adjust.
    move  n1,n2                                        1       1
rows
;   Totals:                                            44      12N^2+13N+22
;   Kernel:                                            12
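
The row major index mapping described for the 514x514 array, and the 3x3 sum formed at each pixel, can be modelled in a few lines of Python. This is an illustrative floating-point sketch; the names `pixel_index` and `fir_3x3` and the `width` parameter are assumptions, introduced so the same code can be tried on a small test image.

```python
def pixel_index(row, col, width=514):
    """Map 1-based (row, col) to a linear index in row major storage."""
    return (row - 1) * width + (col - 1)


def fir_3x3(image, mask, row, col, width=514):
    """Sum of image(row+dr, col+dc) * mask(dr, dc) over the 3x3 window.

    (row, col) is the 1-based top-left corner of the window; mask is a flat
    list c11, c12, c13, c21, ..., c33.
    """
    acc = 0.0
    for dr in range(3):
        for dc in range(3):
            acc += image[pixel_index(row + dr, col + dc, width)] * mask[dr * 3 + dc]
    return acc


small_image = [0.0] * (5 * 5)  # 3x3 image area with a border of zeros
small_image[pixel_index(2, 2, width=5)] = 1.0
mask = [float(v) for v in range(1, 10)]  # c11..c33 = 1..9
center_hit = fir_3x3(small_image, mask, 1, 1, width=5)
```

With the default width, `pixel_index(1, 514)` is 513 and `pixel_index(2, 1)` is 514, matching the mapping stated in the comments above.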
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B.2.24 Signed 16 Bit Result Divide 


;This is a routine for a 4 quadrant divide (i.e., a signed divisor and a signed dividend) 


;which generated a 16-bit signed quotient and a 32-bit signed remainder. The 
quotient is stored in the lower 16 bits of accumulator a, a0, and the remainder in 
the upper 16 bits a1. The true (restored) remainder is stored in b1. The original 
;dividend must occupy the low order 32 bits of the destination accumulator, a, and 
;must be a POSITIVE number. The divisor must be larger than the dividend so that a 


‘fractional quotient is generated. 


opt cc 
abs a a,b 1 1 make dividend positive 
move b,x:$0 2 2 save rem. sign in x:$0 
eor x0,b 1 1 quo. sign in N bit of CCR 
andi #$fe,ccr 1 1 clear carry bit C (quotient sign bit) 
rep #$10 1 2 form a 16-bit quotient 
div x0,a 1 1 form quot. in a0, remainder in a1 
tfr a,b 1 1 save remainder and quot. in b1,b0 
jpl savequo 1 2 go to savequo if quot. is positive 
neg b 1 1 complement quotient if N bit is set 
savequo 
tfr x0,b 1 1 get signed divisor 
move b0,x1 1 1 save quo. in x1 
abs b 1 1 get abs value of signed divisor 
add a,b 1 1 restore remainder in b1 
bftstl #$8000,x:$0 2 2 test if remainder is positive 
beq <done1 1 2 branch if positive 
move #$0,b0 1 1 prevent unwanted carry 
neg b 1 1 complement remainder 
done1 ;end of routine. 
; total 19 37 
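
The sign handling described above can be checked against a small reference model. The Python sketch below is an arithmetic-level illustration only, not a bit-exact emulation of the DIV iteration. The function name, the Q31/Q15 integer encodings, and the choice to return the remainder in dividend units are assumptions made for this example.

```python
def signed_frac_divide(dividend_q31: int, divisor_q15: int) -> tuple[int, int]:
    """Four-quadrant fractional divide, modeled at the arithmetic level.

    dividend_q31: signed 32-bit fraction (value = dividend_q31 / 2**31)
    divisor_q15:  signed 16-bit fraction (value = divisor_q15 / 2**15)
    Returns (quotient_q15, remainder), remainder expressed in dividend units.
    """
    if divisor_q15 == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    # The routine requires |dividend| < |divisor| so the quotient is a fraction.
    if abs(dividend_q31) >= abs(divisor_q15) << 16:
        raise ValueError("|dividend| must be smaller than |divisor|")

    # Work on magnitudes, as the listing does after ABS.
    # One Q15 quotient step is worth 2*|divisor| in Q31 dividend units.
    mag_q, mag_r = divmod(abs(dividend_q31), 2 * abs(divisor_q15))

    # Quotient sign = sign(dividend) XOR sign(divisor).
    q = -mag_q if (dividend_q31 < 0) != (divisor_q15 < 0) else mag_q
    # The remainder takes the sign of the original dividend.
    r = -mag_r if dividend_q31 < 0 else mag_r
    return q, r
```

For example, `signed_frac_divide(-0x20000000, 0x4000)` models -0.25 / 0.5 and returns a quotient of -0x4000 (that is, -0.5 in Q15) with a zero remainder.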
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B.2.25 Signed Integer Divide 


;Registers used: a,b,x0
;Output: Quotient -> a0

opt cc
move #dividend,a 2 2 sign ext A2
move a2,a1 1 1 and A1
move #dividend,a0 2 2 move into A
asl a 1 1 prep divide
move #divisor,x0 2 2 divisor into x0
abs a a,b 1 1 dividend pos
andi #$fe,ccr 1 1 clr the carry
rep #$10 1 2 16bit quotient
div x0,a 1 1 form quot. a0
eor x0,b 1 1 save sign in N
bpl <done2 1 2
neg a 1 1 comp. if N bit is set
done2 nop ;finished
; total 15 32
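
As a cross-check for the listing's result, the intended outcome can be modeled as a magnitude divide followed by a sign fix-up. This Python sketch is a hypothetical reference, not a cycle-level emulation. It assumes the quotient is truncated toward zero, which is what dividing the magnitudes and then negating implies.

```python
def signed_int_divide(dividend: int, divisor: int) -> int:
    """Signed integer quotient, truncated toward zero (assumed behavior)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    # Divide magnitudes, mirroring the ABS step before the DIV iterations.
    magnitude = abs(dividend) // abs(divisor)
    # Negate when the operand signs differ (the EOR / BPL / NEG sequence).
    return -magnitude if (dividend < 0) != (divisor < 0) else magnitude
```

Python's own `//` operator rounds toward negative infinity, so `-7 // 2` gives -4 while this model returns -3.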
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B.2.26 Multiply 32-bit Fractions 


;This routine will execute the multiplication of two 32-bit FRACTIONAL numbers that
;are already stored in memory as follows:
;
; r0 -> X:$Paddr   P0
;       X:$Paddr+1 P1
;
; r3 -> X:$Qaddr   Q0
;       X:$Qaddr+1 Q1
;
;The initial 32-bit numbers are:
; P = P1:P0 (16:16 bits)
; Q = Q1:Q0 (16:16 bits)
;
;The result, R, is a 64 bit number that is stored in the two
;accumulators A and B as follows:
;
; R = R3:R2:R1:R0
;   = A1:A0:B1:B0 = (32:32 bits)
;   = A2:A1:A0:B1:B0 (sign extended)


opt cc
move #paddr,r0 2 2 init pointer for P
move #qaddr,r3 2 2 init pointer for Q
nop
move x:(r0)+,y0 x:(r3)+,x0 1 1 P0,Q0
move x:(r0)+,y1 x:(r3)+,x1 1 1 P1,Q1
mpyuu x0,y0,a 1 1
move a0,b0 1 1 b0=P0*Q0=R0
dmacsu x1,y0,a 1 1 a=P0*Q1+a1
macsu y1,x0,a 1 1 a=a+P1*Q0
move a0,b1 1 1 b1=R1
dmacss x1,y1,a 1 1 a=P1*Q1+a1=R3:R2
; total 4+8 4+8
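
The partial-product structure of this routine (unsigned*unsigned, two signed*unsigned cross terms, signed*signed) can be illustrated with a short Python model. This is a sketch under stated assumptions: the helper names are hypothetical, operands are signed 32-bit integers read as Q31 fractions, and the fractional left shift by one is applied once at the end rather than inside each multiply as the hardware would do.

```python
def split32(value: int) -> tuple[int, int]:
    """Split a signed 32-bit integer into (signed high word, unsigned low word)."""
    return value >> 16, value & 0xFFFF


def mul32_frac(p: int, q: int) -> int:
    """Q31 x Q31 -> Q63 product built from four 16 x 16 partial products."""
    p1, p0 = split32(p)
    q1, q0 = split32(q)
    low = p0 * q0                 # unsigned * unsigned (mpyuu)
    cross = p1 * q0 + q1 * p0     # signed * unsigned terms (dmacsu / macsu)
    high = p1 * q1                # signed * signed (dmacss)
    product = low + (cross << 16) + (high << 32)
    return product << 1           # fractional alignment: Q62 -> Q63
```

With `p = q = 0x40000000` (0.5 in Q31) the model returns `2**61`, which is 0.25 in Q63.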
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B.3 SECOND SET OF BENCHMARKS 


B.3.1 Sine Wave Generation Using Double Integration Technique 


a = Stored initial value, which is the desired tone amplitude

x0 = 2*sin(πFs/F0)
F0 = Oscillation Frequency
Fs = Sampling Frequency


opt cc 
clr b 1 1 
move #$4000,a 2 2
move #0,n1 2 2 
move #$4532,x1 2 2 
move #9111 2 2 
move x0,y0 1 1 
do y1,loop1 2 
mac x0,b1,a b,x:(r1)+n1 1 1
mac -y0,a1,b 1 1
loop1 
move b,x:(r1) 1 1 
; 15 2N+14
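
The two MAC instructions in the loop form a pair of coupled integrators. A floating-point Python sketch of that recurrence follows. It is an illustration rather than a fixed-point emulation. The function name is hypothetical, and the coefficient is taken as k = 2*sin(π*F0/Fs), the conventional form for this structure, which is an assumption about the intended scaling.

```python
import math


def double_integration_sine(amplitude: float, f0: float, fs: float, count: int) -> list[float]:
    """Generate `count` samples with two coupled integrators."""
    k = 2.0 * math.sin(math.pi * f0 / fs)  # assumed coefficient convention
    a, b = amplitude, 0.0                  # a preloaded with the amplitude, b cleared
    samples = []
    for _ in range(count):
        samples.append(b)   # the listing stores b in parallel with the first MAC
        a += k * b          # mac  x0,b1,a
        b -= k * a          # mac -y0,a1,b (uses the freshly updated a)
    return samples
```

With theta = 2*π*F0/Fs, the stored sequence follows b(n) = -amplitude*sin(n*theta)/cos(theta/2), so the output is a sinusoid at F0 whose peak is slightly larger than the preloaded amplitude.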
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B.3.2 Sine Wave Generation Using Second Order Oscillator 
a = Stored initial value, which is the desired tone amplitude

[Figure: second order oscillator with coefficient x0 and output sin(ω0 t)]

x0 = 2*cos(2πFs/F0)
F0 = Oscillation Frequency
Fs = Sampling Frequency
opt cc 
clr a 1 1 
move #$4000,x1 2 2 
move #$6d4b,x0 2 2 
move #9111 2 2 
move #0,n1 2 2 
do y1,loop2 2 3
mac -x1,x0,a x1,x:(r1)+n1 1 1 
neg a 1 1 
mac x1,x0,a 1 1 
tfr x1,a a,x1 1 1
loop2
move x1,x:(r1) 1 1
; 16 4N+13
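
A second order oscillator of this kind rests on the recurrence y(n) = 2*cos(theta)*y(n-1) - y(n-2). The Python sketch below shows that recurrence in floating point. It does not reproduce the register scheduling of the listing. The function name, the initial conditions, and the choice theta = 2*π*F0/Fs are assumptions made for the example.

```python
import math


def second_order_oscillator(amplitude: float, f0: float, fs: float, count: int) -> list[float]:
    """Generate amplitude*sin(n*theta) using a two-term recurrence."""
    theta = 2.0 * math.pi * f0 / fs        # assumed normalized frequency
    coeff = 2.0 * math.cos(theta)
    previous = -amplitude * math.sin(theta)  # y(-1)
    current = 0.0                            # y(0)
    samples = []
    for _ in range(count):
        samples.append(current)
        previous, current = current, coeff * current - previous
    return samples
```

Only one multiply per sample is needed, at the cost of amplitude and phase being fixed by the two initial state values.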
B.3.3 IIR Filter Using Cascaded Transpose BIQUAD Cell

[Figure: transpose biquad cell signal flow with coefficients b0, b1, b2, -a1, -a2 and delay elements T; X memory map showing the states w11(n-1), w12(n-1), w21(n-1), w22(n-1), ... and, at r3, the halved coefficients b1(0)/2, b1(1)/2, b1(2)/2, ...]

EQUATION:
H(z) = (b0 + b1*z^-1 + b2*z^-2) / (1 + a1*z^-1 + a2*z^-2)
y(n) = b0*x(n) + w1(n-1)
w1(n) = b1*x(n) - a1*y(n) + w2(n-1)
w2(n) = b2*x(n) - a2*y(n)

IMPLEMENTATION:
y(n)/2 = b0/2*x(n) + w1(n-1)/2
w1(n)/2 = b1/2*x(n) - a1/2*y(n) + w2(n-1)/2
w2(n)/2 = b2/2*x(n) - a2/2*y(n)
opt cc
move #w1,r0
move #w2,r1
move #N-1,m0
move m0,m1
move #0,n0
move #0,n1
move #C,r3
ori #08,mr
move x:(r0)+n0,b x:(r3)+,x0 1 1 b=w1;x0=b0/2
asr b 1 1 b=w1/2
movep x:<<input,y0 1 2 y0=x
do #N,end_lp 2 3
macr y0,x0,b x:(r1)+n1,a x:(r3)+,x0 1 1 b=y/2;get w2,b1/2
asr a b,y1 1 1 a=w2/2;y1=y
mac x0,y0,a x:(r3)+,x0 1 1 a=x*b1/2+w2/2,get a1/2
macr x0,y1,a x:(r3)+,x0 1 1 a=w1/2;get b2/2
mpy x0,y0,a a,x:(r0)+ 1 1 a=x*b2/2;save w1
move x:(r3)+,x0 b,y0 1 1 y0=y;get a2/2
macr y1,x0,a x:(r0)+n0,b x:(r3)+,x0 1 1 a=w2/2
;get next w1, next b0/2
asr b a,x:(r1)+ 1 1 b=w1/2; save w2
end_lp
movep y0,x:<<output 1 2 output y
; 14 8N+9
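
The three difference equations given for this filter translate directly into a floating-point reference model, which can be handy when checking fixed-point output. The Python sketch below is such a model under stated assumptions: the function name and the `(b0, b1, b2, a1, a2)` tuple layout are hypothetical, and the divide-by-two scaling used in the fixed-point implementation is left out.

```python
def biquad_cascade(samples, sections):
    """Run samples through cascaded transposed biquad cells.

    sections: iterable of (b0, b1, b2, a1, a2) tuples, one per cell.
    """
    sections = list(sections)
    states = [[0.0, 0.0] for _ in sections]   # [w1(n-1), w2(n-1)] per cell
    output = []
    for x in samples:
        for (b0, b1, b2, a1, a2), w in zip(sections, states):
            y = b0 * x + w[0]                  # y(n)  = b0*x(n) + w1(n-1)
            w[0] = b1 * x - a1 * y + w[1]      # w1(n) = b1*x(n) - a1*y(n) + w2(n-1)
            w[1] = b2 * x - a2 * y             # w2(n) = b2*x(n) - a2*y(n)
            x = y                              # this cell's output feeds the next cell
        output.append(x)
    return output
```

A single cell with coefficients `(1, 0, 0, -0.5, 0)` behaves as a one-pole filter, so a unit impulse produces 1, 0.5, 0.25, 0.125, and so on.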
IIR Filter Using The Nth Order Direct Form II Canonic

[Figure: Nth order direct form II canonic filter; delay line w(n-1), w(n-2), w(n-3), ..., w(n-N) in X memory, coefficients a0, -a1, ..., -aN and b0, b1, ..., bN addressed through r3; H(z) is the ratio of the b(i) and a(i) polynomials in z^-1]

;The equation of the filter becomes:
; wn = a0*xn - a1*wn-1 - a2*wn-2 ... - aN*wn-N
; yn = b0*wn + b1*wn-1 + b2*wn-2 ... + bN*wn-N
opt cc
move #C,r3
move #(N*2+1),m3
move #w,r0
move #N,m0
move #0,n0
movep x:<<input,y0 1 2 y0=xn
clr a x:(r3)+,x1 1 1 x1=a1
rep #N 1 2
mac y0,x1,a x:(r0)+,y0 x:(r3)+,x1 1 1
macr y0,x1,a x:(r3)+,x1 1 1 a=wn, x1=b0
clr a a,x:(r0)+n0 1 1
move x:(r0)+,y0 1 1 y0=wn
rep #N 1 2
mac y0,x1,a x:(r0)+,y0 x:(r3)+,x1 1 1
macr y0,x1,a 1 1 a=yn
movep a,x:<<output 1 2 output y
; 11 2N+13 filter loop
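
The two filter equations can be modeled directly in floating point. The Python sketch below is a hypothetical reference, not the routine itself. It assumes the coefficient lists are laid out as `a = [a0, a1, ..., aN]` and `b = [b0, b1, ..., bN]`, with the signs applied exactly as in the equations.

```python
def direct_form_ii(samples, a, b):
    """Nth order direct form II filter.

    w(n) = a0*x(n) - a1*w(n-1) - ... - aN*w(n-N)
    y(n) = b0*w(n) + b1*w(n-1) + ... + bN*w(n-N)
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    order = len(a) - 1
    delay = [0.0] * order            # delay[i-1] holds w(n-i)
    output = []
    for x in samples:
        w = a[0] * x - sum(a[i] * delay[i - 1] for i in range(1, order + 1))
        y = b[0] * w + sum(b[i] * delay[i - 1] for i in range(1, order + 1))
        delay = ([w] + delay)[:order]  # shift the delay line by one sample
        output.append(y)
    return output
```

A single delay line is shared by the recursive and non-recursive halves, which is why the structure needs only N state words.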
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B.3.4 Find the Index of a Maximum Value in an Array 
opt cc
move #AD,r0 2 2
move #-2,n1 2 2
clr a x:(r0)+,b 1 1
do #N,end_lp3 2 3
cmpm b,a b,y1 1 1
tle y1,a r0,r4 1 1
move x:(r0)+,b 1 1
end_lp3
nop
lea (r1)+n1,r4 1 2
; 11 3N+10
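
Because the loop compares with CMPM (compare magnitude), the search can be modeled as an index-of-largest-magnitude scan. The Python sketch below is a hypothetical reference. It assumes that a later element replaces the running maximum when the magnitudes are equal, which is how a "less than or equal" transfer condition would behave.

```python
def index_of_max_magnitude(values):
    """Return the index of the element with the largest absolute value."""
    if not values:
        raise ValueError("values must not be empty")
    best_index = 0
    best_magnitude = abs(values[0])
    for index in range(1, len(values)):
        magnitude = abs(values[index])
        if best_magnitude <= magnitude:   # update on ties too (assumed)
            best_index = index
            best_magnitude = magnitude
    return best_index
```

With the input `[1, -5, 3]` the model returns index 1, since -5 has the largest magnitude.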


B.3.5 Proportional Integrator Differentiator (PID) Algorithm 


[Figure: PID signal flow and X memory map holding the samples x(n-2) ... x(n) and the coefficients]

;The PID is the most commonly used algorithm in control applications
;y(n) = y(n-1) + k0 x(n) + k1 x(n-1) + k2 x(n-2)


opt cc
move #k,r3 ;
move #s8+2,r0 ;
move #-1,n0 ;
move #2,m0 ;r0 mod 3
movep x:<<input,x0 ; x(n) in x0
move x:(r0)+,b x:(r3)+,y0 1 1
mac x0,y0,b x:(r0)+,y0 x:(r3)+,x1 1 1
mac y0,x1,b x:(r0)+,y0 x:(r3)+,x1 1 1
macr y0,x1,b x0,x:(r0)+n0 1 1
move b,x:(r0) 1 1
movep b,x:<<output ;y(n) in b
; 5 5
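
The update y(n) = y(n-1) + k0 x(n) + k1 x(n-1) + k2 x(n-2) needs only three state values per call. The Python sketch below illustrates one step of that recurrence in floating point. The function name and the state ordering `[y(n-1), x(n-1), x(n-2)]` are assumptions chosen for the example, not the memory layout of the listing.

```python
def pid_step(x, state, k):
    """Advance the PID recurrence by one sample.

    state: mutable list [y(n-1), x(n-1), x(n-2)], updated in place.
    k:     coefficients (k0, k1, k2).
    """
    y = state[0] + k[0] * x + k[1] * state[1] + k[2] * state[2]
    state[:] = [y, x, state[1]]   # shift the history: x(n-1) becomes x(n-2)
    return y


# Usage: with k = (1, 0, 0) the recurrence reduces to a running sum.
history = [0, 0, 0]
running_sum = [pid_step(sample, history, (1, 0, 0)) for sample in (1, 2, 3)]
```

After this runs, `running_sum` holds `[1, 3, 6]`.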
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B.3.6 Reed Solomon Main Loop 
[Figure: Reed Solomon main loop data flow; the input is combined with the ALPHA, ALPHA1, and ALPHA2 terms]
;DSP56100 family 
*ng=n1=-1 
opt cc 
do #28,loopn 2 3
move x:(r0)+n0,y1 1 1 ;Get from interleave
move x:(r3)+n3,a 1 1 ;get P4
eor y1,a b,x:(r1)+n1 1 1 ;alpha(a) store p2
move a,n1 1 1 ;Move ALPHA for table lookup
move x:itablebase,b 2 2 ;tableptr in b
add b,a y1,x:(r2)+ 1 1 ;table index (a);store sample
tfr x0,b x:(a1),y1 1 1 ;table entry y1;g1+base (b)
add y1,b 1 1 ;table ptr(b)
tfr y0,a x:(b1),x1 1 1 ;alpha1(x1);g2+base(a)
add y1,a x:(r3)+,b 1 1 ;table ptr(a);P3(b)
eor x1,b x:(a1),y1 1 2 ;p4(b),alpha2(a)
move x:(r1)-,a 1 1 ;p2(a)
eor y1,a b,x:(r3)+n3 1 1 ;p3(a), store p4
move x:(r1),b 1 1 ;p1(b)
eor x1,b a,x:(r3)+ 1 1 ;Add ALPHA2+P2, s new P1
move n1,x:(r1)+ 1 1 ;store p1
loopn ;
; 17 34+28*18
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B.3.7 N Double Precision Real Multiplies 
opt cc 
move #AD,r0 2 2 
move #BD,r3 2 2 
move #c,r1 2 2
move x:(r0)+,y0 x:(r3)+,x0 1 1
do #N,end_loop 2 3
move x:(r0)+,y1 x:(r3)+,x1 1 1
mpyuu x0,y0,a 1 1 
move a0,x:(r1)+ 1 1 
dmacsu x1,y0,a 1 1 
macsu y1,x0,a 1 1 
move a0,x:(r1)+ 1 1 
dmacss y1,x1,a 1 1 
move x:(r0)+,y0 x:(r3)+,x0 1 1
move a0,x:(r1)+ 1 1 
move a,x:(r1)+ 1 1 

end_loop ; 

; 19 10*N+10 

B.3.8 Double Precision Autocorrelation 

; N: speech frame size
; p: LPC order
;DSP56100 family
opt cc
move #cor,r1 2 2
move #frame,r2 2 2
do #lpc+1,_loop1 2 3
move r2,r3 1 1
clr b 1 1
move #frame,r0 2 2
lua (r2)+,r2 1 2
move lc,x1 1 1
move #>N-(p+1),a 2 2
add x1,a x:(r0)+,y0 x:(r3)+,x0 1 1
rep a 1 2
mac y0,x0,b x:(r0)+,y0 x:(r3)+,x0 1 1
move b0,x:(r1)+ 1 1
move b1,x:(r1)+ 1 1
_loop1 ;
; 19 (p+1)*(N-p/2)+14(p+1)+5

;example: N=160 ; p=8
; DSP56100 family: 12,767 cycles at 25ns = 0.32ms (1.56% of 20ms)
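
For reference, the quantity this benchmark computes is the set of autocorrelation lags 0 through p over one frame. A short Python sketch of that definition follows. The function name is hypothetical, and Python's unbounded integers stand in for the double precision (b1:b0) accumulation of the listing.

```python
def autocorrelation(frame, order):
    """Return [r(0), r(1), ..., r(order)] with r(k) = sum of x(i)*x(i+k)."""
    length = len(frame)
    if order >= length:
        raise ValueError("order must be smaller than the frame length")
    return [
        sum(frame[i] * frame[i + lag] for i in range(length - lag))
        for lag in range(order + 1)
    ]
```

Each lag k sums N-k products, so the total multiply-accumulate count for lags 0..p is (p+1)*(N-p/2), which matches the leading term of the cycle formula above.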
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DSP56100 


16-BIT DIGITAL SIGNAL PROCESSOR FAMILY 


This document, containing changes, additional features, further explanations, and 
clarifications, is an addendum to the original document listed below: 


Document Name: DSP56100 Family Manual 


Order Number: DSP56100FM/AD 
Revision: 0 
Change the following: 


Page A-40 - For the BFCLR instruction, under “Explanation of Example:” change the phrase 
on the last portion of the last sentence to read “clears the carry bit C in CCR because not all 
these bits were clear, and then clears the bits.” 


Page A-40 - For the BFCLR instruction, under C condition code bit definition listed under the 
title “For other destination operands:” change the definition to read: 


C — Set if all the bits specified by the mask are clear. 
Clear if not all the bits specified by the mask are clear. 


Pages A-48 (Bcc instruction), A-50 (BRA instruction), A-54 (BScc instruction), and A-56 (BSR 
instruction) under “Restrictions” remove the last item (“—Not allowed between addresses 
P:$0 and P:$40.”). 


Page A-147 - For MOVE(C) instructions using the instruction format: 


MOVE(C) X:<A1,B1>,D 
MOVE(C) S,X:<A1,B1> 


change the box that appears as: 


ea Z 
(A1) 0 
(B1) 1 


to the following: 


(A1) 1
(B1) 0 


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Pages A-153, A-155, A-157, and A-159 - In the table that defines the value of W, add a second 
line as shown below: 


Reg. W
read S 0 
write D 1 


Page A-232 - Change Table A-9 to the following: 


Table A-9 MOVEM Timing Summary 


MOVEM Operation          + mvm Cycles    Comments
Register <> P Memory     4+ea+ap
X Memory <> P Memory     4+ea+ax+ap
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